I. INTRODUCTION

This  document  contains  a  summary  of  the  work  accomplished  under  Grant 
AFOSR  83-0166  during  the  time  period  18  May  1985  through  June  30,  1986. 

Section  II  contains  a  summary  of  the  work  accomplished  to  date  under  the  current 
year  of  funding.  This  summary  is  supplemented  by  appendices. 

Section  III  is  devoted  to  various  administrative  matters  pertinent  to  the  grant. 

II. WORK ACCOMPLISHED

(a)  Optical  Interconnections 

Optical  interconnections  has  been  an  area  of  investigation  under  AFOSR  support 
for  several  years.  The  powerful  interconnect  abilities  of  optical  beams  have  led  many 
to  believe  that  one  of  the  most  important  roles  for  optics  in  computing  in  the  future 
will  be  as  an  interconnect  technology. 

The focus of our efforts in this area has been on the use of holographic optical
elements for providing such interconnects. During the past contract year our
accomplishments have been two-fold: (1) an analytic comparison of optical and electronic
interconnects in the problem of chip-to-chip communication (published in Applied
Optics, with a reprint attached as Appendix A to this report); and (2) a very detailed
investigation of holographic optical elements and their capabilities in the role of
interconnect elements (results presented at the 1985 Annual Meeting of the OSA, and in
more detail as an invited paper at the OSA Topical Meeting on Holography, April
1986). Since the results obtained in the first of these two areas are found in the
appendix, we discuss in more detail only the second area above.

Of particular interest in the interconnect problem is the diffraction efficiency that
can be achieved with a holographic optical interconnect element, as well as the ability
of that element to efficiently concentrate light onto a small-area photodetector. We
have developed a ray-trace program that accounts not only for the density of rays in
the image space (as do most conventional ray-trace programs) but also for the diffraction
efficiency associated with each of the rays, thus enabling us to obtain image irradiance
profiles at the detector plane. The predictions of this program have been extensively
verified experimentally using bleached silver halide emulsions. The approach is
sufficiently general that the effects of fan-out on diffraction efficiency can be included,
an important issue in interconnect problems. The holograms studied are generally
reflection elements with focusing power. The diffraction efficiency associated with
each ray is determined from coupled mode theory, using another program developed
expressly for that purpose.
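
As a rough illustration of the weighting idea described above (and not the program developed under the grant), the following Python sketch bins ray landing positions at the detector plane and weights each ray by its diffraction efficiency rather than counting it as one. The function name, the one-dimensional detector line, and the sample rays are hypothetical; in practice the landing positions would come from the ray trace and the efficiencies from the coupled-mode calculation.

```python
def irradiance_profile(hits, efficiencies, x_min, x_max, n_bins):
    """Efficiency-weighted histogram of ray landing positions.

    hits         -- ray landing positions along one detector axis
    efficiencies -- diffraction efficiency (0..1) carried by each ray
    Returns flux per unit length in each of n_bins equal-width bins.
    """
    width = (x_max - x_min) / n_bins
    profile = [0.0] * n_bins
    for x, eta in zip(hits, efficiencies):
        if x_min <= x < x_max:
            # Clamp the index to guard against floating-point rounding at the edge.
            index = min(int((x - x_min) / width), n_bins - 1)
            profile[index] += eta  # a conventional spot diagram would add 1 here
    return [p / width for p in profile]


# Hypothetical example: four rays, two of which land in the same bin.
hits = [0.05, 0.15, 0.17, 0.35]   # positions in arbitrary length units
etas = [0.9, 0.5, 0.3, 0.1]       # assumed per-ray diffraction efficiencies
profile = irradiance_profile(hits, etas, 0.0, 0.4, 4)
```

Setting every efficiency to 1 recovers the ordinary ray-density estimate, which makes the difference between the two kinds of prediction easy to compare.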

One  Ph.D.  student  will  be  completing  his  degree  this  July  in  this  area.  A  major 
publication  on  this  material  has  been  submitted  recently  to  Applied  Optics. 

During the year we have also been devoting attention to more fundamental
aspects of optical interconnections, especially the issues of fan-in and fan-out. A paper
has been published by Optica Acta on this subject and is attached as Appendix B. In
addition, an extensive survey paper on optical interconnections has been under
preparation and will soon be submitted. However, much remains to be done of a fundamental
nature in understanding the proper place for optical interconnects in a hierarchy of
interconnect technologies.


(b)  Defect  Enhancement  using  Four-Wave  Mixing. 

During the past contract year we have fully developed and brought to a
conclusion our ideas on the use of four-wave mixing or phase conjugation as a means for
enhancing defects in periodic structures. Such defect enhancement is needed in the
testing of integrated circuit photomasks, as well as in other inspection problems
involving periodic structures.

Our  early  work  was  devoted  to  the  problem  of  intensity  inversion  using  an 
inherent  nonlinear  property  of  the  phase-conjugation  process  in  photorefractives.  This 
work was published in Applied Optics (Vol. 24, pp. 1826-1832, 1985). Following this
work,  we  applied  the  method  to  defect  detection  in  periodic  structures,  with  the  results 
being  published  in  Optics  Letters  (see  Appendix  C  for  a  reprint). 

A Ph.D. candidate finished her work on this topic in the summer of 1985 and is
now employed in industry. A patent application has been filed on the method. No
further work in this area is planned, since it is ready for commercialization.

(c)  Optimal  Imaging  Concentrators 

During  the  past  three  years  we  have  used  a  small  part  of  our  AFOSR  funds  to 
support  supervision  time  of  a  U.S.  Air  Force  Captain  at  Stanford  in  a  Ph.D.  program. 
This  individual  has  now  completed  his  Ph.D.  thesis  in  the  area  of  optimal  imaging 
concentrators,  i.e.  imaging  system  configurations  that  will  maximally  deliver  light  (of 
an  arbitrary  state  of  partial  coherence)  to  a  prescribed  detector  array  of  arbitrary 
geometrical configuration. The research is highly theoretical in nature, but has direct
applications  to  both  optical  interconnections  and  to  high-energy  lasers.  The  early 
results  of  the  work  were  reported  at  the  1985  Annual  Meeting  of  the  Optical  Society 
of  America.  At  about  the  same  time  a  full-length  technical  paper  was  submitted  to 
JOSA-A  for  the  special  issue  on  Coherence  and  Statistical  Optics.  We  expect  this  paper 
to  be  published  within  the  next  month  or  two.  The  Ph.D.  student  working  in  this  area 
will  be  completing  his  final  requirements  this  summer. 

(d)  Neural  Networks  and  Optical  Computing 

During  the  past  contract  year  we  have  undertaken  research  in  a  new  area  that  we 
feel  is  very  exciting  and  promising,  namely  the  application  of  neural  network  ideas  to 
problems  of  optical  computing.  There  is  a  multitude  of  researchers  who  are  currently 
looking at such networks as a possible means for realizing associative or
content-addressable memories. In view of the substantial efforts in this area elsewhere, we
have  chosen  instead  to  focus  on  the  application  of  such  ideas  to  computing. 

For  six  months  during  1985  we  were  fortunate  to  have  as  a  visitor  with  our  group 
Prof.  Mitsuo  Takeda  from  the  University  of  Electro-communications,  in  Tokyo.  Under 
our  encouragement,  Dr.  Takeda  began  an  investigation  in  this  area  in  collaboration 
with  us,  and  results  that  we  feel  are  very  significant  were  obtained.  To  summarize  in  a 
few  words,  we  investigated  the  application  of  the  Hopfield  neural  network  model  to 
the  following  computational  problems: 

1. The "Hitchcock" problem, which is a transportation problem or a resource
allocation problem.

2. Matrix inversion and image deblurring problems.

3.  Signal  processing  problems,  including  spectral  analysis. 

The results of these investigations revealed some interesting points that require
further  investigation: 

1.  For  most  (but  not  all)  problems,  the  most  direct  solution  was  one  that  mixed  the 
"program"  and  the  "data"  in  a  single  interconnect  pattern. 

2. For many (but not all) problems, the computational load associated with
determination of the required interconnect pattern is comparable with the computational
load associated with direct solution of the problem.

3.  For  most  problems,  constraints  must  be  properly  weighted  with  respect  to  the 
energy  function  to  be  minimized,  requiring  rather  ad  hoc  and  empirical  choices. 
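
The first two points can be made concrete with a small sketch. The Python code below is not the formulation of Appendix D; it is a minimal illustration, assuming linear (non-saturating) neurons and the quadratic energy E(x) = ½|Ax − b|², of how a linear system Ax = b can be relaxed by a Hopfield-style network. The function name, step size, and test matrix are hypothetical. The interconnect weights are T = −AᵀA and the bias is Aᵀb, so the problem data enter the interconnect pattern directly, and forming AᵀA is itself a computation of the same order as solving the system.

```python
def hopfield_linear_solve(A, b, step=0.1, iterations=2000):
    """Relax x toward the minimum of E(x) = 0.5 * |A x - b|^2.

    Assumes linear neurons; the update is x <- x + step * (T x + bias),
    i.e. plain gradient descent on the energy.
    """
    m, n = len(A), len(A[0])
    # Interconnect pattern: T = -A^T A (costs O(m n^2) to form).
    T = [[-sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    # Bias (external input) to each neuron: A^T b.
    bias = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    x = [0.0] * n
    for _ in range(iterations):
        x = [x[i] + step * (sum(T[i][j] * x[j] for j in range(n)) + bias[i])
             for i in range(n)]
    return x


# Hypothetical 2 x 2 system: 2x + y = 3, x + 3y = 5.
A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
x = hopfield_linear_solve(A, b)
```

The step size must be smaller than 2 divided by the largest eigenvalue of AᵀA for the relaxation to converge; choosing it is one more ad hoc parameter of the kind noted in the third point.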

In view of the importance we place on this work, both with respect to work
accomplished and work proposed, we are attaching a preprint of the paper to this
report as Appendix D (in spite of its bulk). This work has been accepted for
publication in the Applied Optics special issue on number representations in optical
computing.

III.  ADMINISTRATIVE  MATTERS 

This  section  contains  miscellaneous  information  pertinent  to  the  grant. 

Publications  on  work  fully  or  partially  supported  by  this  grant  and  accepted  or 
published  during  the  last  contract  year  are  as  follows: 

(1) E. Ochoa, L. Hesselink, J.W. Goodman, "Real-time intensity inversion using
two-wave and four-wave mixing in photorefractive Bi12GeO20", APPLIED
OPTICS, Vol. 24, pp. 1826-1832 (1985).

(2)  R.K.  Kostuk,  J.W.  Goodman,  L.  Hesselink,  "Optical  imaging  applied  to 
microelectronic  chip-to-chip  interconnections",  APPLIED  OPTICS,  Vol.  24,  No. 
17, pp. 2851-2858 (1985).

(3) E. Ochoa, J.W. Goodman, L. Hesselink, "Real-time enhancement of defects in a
periodic mask using photorefractive Bi12SiO20", OPTICS LETTERS, Vol. 10, pp.
430-432 (1985).

(4)  J.W.  Goodman,  "Fan-in  and  Fan-out  with  optical  interconnections",  OPTICA 
ACTA,  Vol.  32,  No.  12,  1489-1496  (1985). 

(5) J.W. Goodman, R.K. Kostuk, and B. Clymer, "Optical interconnects: an
overview", Proceedings of the IEEE Conference on Multilevel Interconnects for VLSI,
Santa Clara, California, June 1985, pp. 219-224.

(6) J.W. Goodman, "A random walk through the field of speckle", Optical
Engineering, May 1986.

Papers  under  submission  include: 

(1) P. Idell and J.W. Goodman, "Design of optimal imaging concentrators for
partially coherent sources: absolute encircled energy criterion". Accepted for
publication in JOSA-A.

(2) M. Takeda and J.W. Goodman, "Neural networks and computing: number
representations and programming complexity". Accepted for publication in
APPLIED OPTICS.

Three contributed papers were presented at the 1985 Annual Meeting of the OSA.
An invited paper was presented at the Workshop on Optical Interconnects sponsored
by MCC in Austin, Texas, in November 1985. A plenary paper was presented at
LASER 85 in Las Vegas, Nevada, in December 1985. An invited paper entitled
"Holographic optical elements for optical interconnects" was presented at the OSA
Topical Meeting on Holography, Honolulu, Hawaii, April 1986. An invited paper
entitled "Optical interconnects" was presented at the NSF Workshop on Lightwave
Technology, Tucson, AZ, May 1986. An invited paper entitled "Optical interconnections
and computing" was presented at the US-Japan Workshop on Optoelectronics, Tokyo,
Japan, May 1986.

An imaging system is proposed as an alternative to metallized connections between integrated circuits.
Power requirements for metallized interconnects and electrooptic links are compared. A holographic
optical element is considered as the imaging device. Several experimental systems have been constructed which
have visible LEDs as the transmitters and PIN photodiodes as the receivers. Signals are evaluated at
different source-detector separations. Multiple exposure holograms are used as a means of optical fan out,
allowing one source to simultaneously address several receiver locations. Limitations of this technique are also
discussed.


I.  Introduction 

A limitation of increasing importance in VLSI electronic integrated circuit
design is the interconnections between devices and systems. Restrictions of
conventional interconnects arise from (a) increased space allocated to wiring,
(b) propagation delays with increased line lengths and RC time constants,
(c) inductive noise between lines, (d) dominance of line capacitance over other
sources of capacitance as line lengths increase, and (e) degrading
electromigration effects on wiring materials.1-3 Since different optical signals can
propagate through the same spatial volume without interference, the possibility
of using optical methods to alleviate this space restriction is attractive.4-6 In this
paper we discuss a number of aspects of optical imaging which are applicable to
the electronic interconnect problem and evaluate an experimental system.

II.  Comparison  of  Optical  and  Electronic 
Interconnections 

Figure 1 shows a typical VLSI microelectronic circuit mounted and bonded to a
package which can be connected to other electronic systems. There are several
thousand gates on this circuit and several hundred output pins which allow
communication to other systems. Two levels of interconnection can be identified:
one connects two or more devices on a common chip, and another connects an
integrated system or chip to another chip.

There are a number of ways to compare the performance and capability of
different types of interconnection.'e* Consider one such criterion, the reactive
power required of one electronic inverter to trigger another inverter. Reactive
power is given by

$$P = \frac{C V^{2}}{2\tau},$$

where $C$ is the capacitance of the line and attached devices, $V$ is the device
threshold level (assumed 1 V), and $\tau$ is the clocking period (assumed 1 nsec).

Figure 2 illustrates gate-to-gate connection.7 The gate capacitance of two
devices and the metal line connecting them must be charged to the threshold
potential for the gate. The gate capacitance is given by

$$C_g = \frac{\epsilon_r \epsilon_0 A}{d},$$

where $\epsilon_r = 3.9$ for SiO2, $\epsilon_0 = 8.854 \times 10^{-14}$ F/cm, $A$ is the
device area, and $d$ is the oxide layer thickness. Projected VLSI device lengths
and oxide layer thickness are 0.5 and 0.02 µm, respectively. This gives a gate
capacitance of $C_g = 50$ fF/device. The capacitance of the line joining two devices is

$$C_l = \frac{\epsilon_r \epsilon_0\, l\, w}{h},$$

where $l$ is the line length, $w$ the linewidth, and $h$ the height of the line above
the ground plane. The width/height ratio is restricted by fringing field effects
to a minimum value of ~2. For a typical VLSI circuit the average line length is
~1 mm. This gives a line capacitance of $C_l = 70$ fF. The total capacitance of this
link is then $C_t = 2C_g + C_l = 170$ fF, and the corresponding reactive power is 85 µW.

1 September 1985 / Vol. 24, No. 17 / APPLIED OPTICS 2851


Fig. 1. VLSI circuit (manufactured by Honeywell) with ~3000 gates and
bonding pads. Interconnections exist between gates on a common substrate and
from bonding pads to other circuits and outside systems.


Fig. 2. Schematic of gate-to-gate connection for two inverters. The
line between gates is modeled as a single capacitor.

Figure 3 shows a chip-to-chip connection. To minimize propagation delays, gate
capacitances are gradually increased in size until the device capacitance is
comparable to that of a bonding pad.' A voltage pulse from a logic element must
have sufficient power to charge these gates, two bonding pads, the line
connecting them, and a receiving gate to the device threshold level. The total
capacitance of this link is $C_t = 2C_b + C_l + 2C_g$, where $C_b$ is the bonding pad
capacitance. For a pad area of ~100 µm² and assuming a SiO2 dielectric, this
capacitance is ~0.4 pF. Lines connecting the pads are 25 µm in width and are
assumed to be 500 µm above the ground plane. When a number of chips are
connected on the same substrate, a typical length separating a nearby pair is of
the order of 1 cm. At this distance transmission line standing wave effects
are not significant (i.e., λ = 30 cm).

The line capacitance in this case is only 4.5 fF. The total capacitance becomes
$C_t$ = 0.8 pF + 0.0045 pF + 0.1 pF = 0.9 pF and the switching power $P_t$ = 430 µW.
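
The two electrical cases can be checked with a few lines of Python. This is a sketch under a stated assumption: the reactive power is taken as P = CV²/(2τ), which is the form that reproduces the 85 µW quoted for the 170 fF gate-to-gate link. The function and variable names are illustrative only.

```python
def reactive_power(capacitance_f, threshold_v=1.0, clock_period_s=1e-9):
    """Reactive power P = C V^2 / (2 tau), assumed form (see lead-in)."""
    return capacitance_f * threshold_v ** 2 / (2.0 * clock_period_s)


C_GATE = 50e-15          # gate capacitance quoted in the text, in farads
C_LINE_ON_CHIP = 70e-15  # 1 mm on-chip line quoted in the text, in farads

# Gate-to-gate link: two gates plus the line joining them (170 fF).
gate_to_gate_w = reactive_power(2 * C_GATE + C_LINE_ON_CHIP)

# Chip-to-chip link: the rounded 0.9 pF total quoted in the text.
chip_to_chip_w = reactive_power(0.9e-12)
```

With these inputs the gate-to-gate figure comes out at 85 µW, and the rounded 0.9 pF chip-to-chip total gives about 450 µW, the same order as the 430 µW quoted.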

Next consider a simple electrooptic link consisting of a semiconductor source
and detector. Initially it is assumed that all the light from the source is
focused on the detector. The detector circuit model is shown in Fig. 4. The
current generated is a function of the physical parameters of the junction and
the illumination:8

$$i_p = \frac{\Phi q}{h\nu}\,(1 - r)\left(1 - e^{-\alpha z}\right),$$

where $i_p$ is the photocurrent, $\Phi$ is the optical flux, $q$ is the electronic
charge, $r$ is the Fresnel reflection coefficient of the detector surface, $h\nu$ is
photon energy, $\alpha$ is the semiconductor absorption coefficient at $\lambda$, and
$z$ is the absorption width. Typical responsivity for a silicon device is 0.4 A/W.

Fig. 4. Detector circuit model. The space-charge region of the junction
results in a capacitance shunting a photon-induced current source. The series
resistance is typically a few ohms and can be neglected. The parallel
resistance is of the order of 10'* Ω and can be assumed to be an open circuit.

The usual condition of low series and large shunt resistance simplifies the
model to a capacitance shunting a current source. Current from the detector must
charge the gate to its threshold level in a time less than the clocking period
$\tau$. If no preamplifier is assumed, all current must originate from electrons
generated from the incident optical flux $\Phi$. For a 2-µm thick, 25-µm square
active area detector, the junction capacitance is $C_j = 32.5$ fF. Since the
detector must charge the capacitance of a gate, the total capacitance is
$C_t = C_j + C_g = 82.5$ fF. For a threshold voltage of ~1 V,

$$V = Q/C_t, \qquad \tau = 1\ \text{nsec}.$$

With 200 µW of incident optical power, 80 µA of current can be generated in the
detector and can produce 80 fC of charge. This is sufficient to produce the 1-V
threshold value. Assuming a laser diode electrical to optical conversion
efficiency of 30%, the electrooptic link will require ~670 µW of electrical
power. (A large fraction of the power needed to drive a diode is not reactive.
The important consideration here is the amount of power required to transmit
comparable optical and electrical signals. In the electrical case with FET type
devices, the power is primarily reactive in nature, while the optical system
also requires a real power component. The consequence of this difference is not
significant at this level of analysis. It would need consideration if heat
dissipation effects were investigated.)
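
The chain of numbers in this link budget can be reproduced with the short Python sketch below. It uses only values quoted in the text (0.4 A/W responsivity, 200 µW optical power, 1 nsec clock period, 32.5 fF junction plus 50 fF gate capacitance, 30% source efficiency); the function name and dictionary keys are illustrative.

```python
def link_budget(optical_power_w=200e-6, responsivity_a_per_w=0.4,
                clock_period_s=1e-9, c_junction_f=32.5e-15,
                c_gate_f=50e-15, source_efficiency=0.30):
    """First-order electrooptic link budget with no preamplifier."""
    photocurrent = responsivity_a_per_w * optical_power_w   # amperes
    charge = photocurrent * clock_period_s                  # coulombs per clock
    voltage = charge / (c_junction_f + c_gate_f)            # volts on the gate
    electrical_power = optical_power_w / source_efficiency  # watts into source
    return {
        "photocurrent_a": photocurrent,
        "charge_c": charge,
        "voltage_v": voltage,
        "electrical_power_w": electrical_power,
    }


budget = link_budget()
```

The result is 80 µA, 80 fC, about 0.97 V across the 82.5 fF load, and about 667 µW of electrical drive power, in line with the rounded figures in the paragraph above.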

These first-order considerations indicate that, with currently available
electrooptic technology, the power required for an electrooptic link is of the
same order of magnitude as that necessary for the electrical chip-to-chip
interconnection and would not suffer from the problems previously outlined for
conventional interconnections. The electrooptic link compares less favorably
with gate-to-gate connections on the same chip.

III.  Optical  Chip-to-Chip  Layout 

The chip-to-chip interconnect problem can be formulated in more specific terms
as shown in Fig. 5. One or more integrated systems are mounted on a common
substrate separated by distances of ~1 cm. As mentioned previously, at these
lengths and frequencies of 1 GHz, transmission line effects are not significant.
Bonding pads are assumed to be 100-µm square and separated by 100 µm. Several
hundred bonding pads must be connected. Each transmission point should be able
to address several receiver locations; it is also desirable for channels to
cross without interference.

An  imaging  system  can  provide  this  connection 
mechanism.  Consider  the  arrangement  of  Fig.  6.  A 
semiconductor  emitter  illuminates  a  holographic  optical 
element  coded  to  distribute  radiation  to  one  or  more 
image  points.  Photodiodes  convert  optical  to  electrical 
signals,  which  are  then  decoded  by  a  digital  electronic 
circuit. 

Advantages of using holographic elements include their adaptability to
decentered layouts by using off-axis recording geometries and to fan out by
using sequentially exposed multiple holograms.

A number of factors must be considered in a practical system of this type. The
most attractive sources and detectors are those made from materials which are
compatible with integrated electronics. Semiconductor sources developed for
optical communications have emission wavelengths from 780 nm to 1.6 µm. To date
only a few holographic recording materials are responsive at these wavelengths
and these are not very sensitive.9

Other considerations are the emission profile and polarization characteristics
of the source. Laser diodes have an emission profile corresponding to the
diffraction pattern of the junction geometry. Planar stripe junction diodes have
transverse mode divergence angles which have typical values of 60° by 10°.
Therefore only a portion of the volume above the source will be illuminated.
The hologram need only occupy this region above the source to be effective.

The polarizations of these two directions are orthogonal. Kogelnik10 has shown
that polarization vectors oriented in the plane of incidence of the grating
produce a reduced coupling constant and diffraction efficiency which results in
lower image intensity.

Fig. 5. Geometrical layout of a chip-to-chip connection. Two integrated
circuits are mounted on a common substrate with L = 1.5 cm, w = 1 cm, and
bonding pad widths and separations = 100 µm.

Fig. 6. Imaging system for chip-to-chip communication. Light emitting
sources and detectors replace transmitting and receiving bonding pads. A
hologram is used as the imaging element. Design must include f/No. or f/D
ratio, intensity emission profile of the source, and source-detector separation.

An LED is also a potential semiconductor source. It has the advantage of being
a surface emitter and is much easier to fabricate than a laser diode. In
addition, LEDs can be made to emit in the visible by introducing traps in the
band gap. However, they are inefficient in comparison to laser diodes and have
spectral bandwidths of ~20 nm. They also emit unpolarized light, which results
in lower diffraction efficiency for the reason mentioned above. Their intensity
emission profile is cosinusoidal in angle and therefore illuminates a larger
region of a hologram than would a laser diode. Image reconstructions with this
type of emission profile are brighter when the hologram occupies large solid
angles relative to the source.


IV.  Holographic  Optical  Element  Characteristics 

The requirement for a compact system implies that the element must have a small
f/No. This also improves flux collection. The meridional angles for f/1 and
f/3.5 elements are 26.5° and 8.1° in air. A model for diffraction efficiency
must be valid for grating vectors covering this angular range. A relatively
simple description of grating diffraction efficiency is Kogelnik's coupled
two-wave treatment.10 The expression of efficiency for reflection holograms
with absorption is given by


$$\eta = \left[\frac{\xi}{\nu} + \left(1 + \frac{\xi^{2}}{\nu^{2}}\right)^{1/2} \coth\left(\nu^{2} + \xi^{2}\right)^{1/2}\right]^{-2},$$

where $\eta$ is the diffraction efficiency, $n_1$ is the refractive-index
modulation, $d$ is the grating thickness, $c_s$ and $c_r$ are obliquity factors,
$\alpha$ is the absorption/length, and $\theta$ is the Bragg angle. The planar
grating treatment can be extended to curved surfaces by assuming that the
surface is locally plane in the region where the ray intersects the grating.11
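
The meridional angles quoted for the two elements follow from simple geometry. The Python sketch below assumes the meridional angle is that of the marginal ray from the focal point to the edge of the aperture, so that tan θ = (D/2)/f = 1/(2N) for f-number N; this assumption reproduces the 26.5° and 8.1° quoted for f/1 and f/3.5. The function name is illustrative.

```python
import math


def meridional_angle_deg(f_number):
    """Half-angle of the marginal ray, assuming tan(theta) = 1 / (2 N)."""
    return math.degrees(math.atan(1.0 / (2.0 * f_number)))


angle_f1 = meridional_angle_deg(1.0)    # about 26.6 degrees
angle_f35 = meridional_angle_deg(3.5)   # about 8.1 degrees
```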

A number of planar volume phase holograms were formed in bleached photographic
film. The thickness, refractive index, and postbleached absorption were
measured to obtain average values for these parameters. The results were then
used in Kogelnik's model to predict the diffraction efficiency curves and were
compared with measured curves. Although slight changes to the values for
absorption and emulsion thickness change had to be used, the agreement was very
good. Figure 7 shows two measured diffraction efficiency curves from gratings
with K orientations approximately equal to the meridional angles of f/1 and
f/3.5 systems. High diffraction efficiency is maintained over a large range of
playback angles. The holographic optical element (HOE) field of view is
essentially this angular range and is ~30° for 25° grating slant angles and 60°
for 10° slant angles corresponding to the f/3.5 system.

A single grating element can interconnect a number of sources and their
conjugate receiver locations over the angular range of high efficiency. When
source reconstruction coordinates differ significantly from formation
positions, hologram image aberrations reduce image irradiance. Aberrations can
be evaluated with ray tracing techniques. For thick holograms these expressions
may be derived from the reflected ray components which are perpendicular and
tangent to the grating vector: r = (K·r)K − K × (K × r), where r is a unit
vector along the reconstruction ray, and K is the grating vector given by
K = r_o − r_c, and r_o and r_c are unit vectors along the object and reference
ray directions, respectively. The reconstructed or image ray is
r' = −(K·r)K − K × (K × r).
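
The image-ray expression amounts to reversing the component of the reconstruction ray along the grating vector, which can be sketched in Python as follows. One assumption is made explicit here: K is normalized to unit length before use, since the decomposition only holds for a unit vector. With that normalization, −(K·r)K − K × (K × r) reduces to r − 2(K·r)K. The helper names and the sample directions are hypothetical.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reconstruct_ray(r, r_o, r_c):
    """Image ray r' = r - 2 (K.r) K with K the unit vector along r_o - r_c.

    r, r_o, r_c are unit vectors along the reconstruction, object, and
    reference rays. Normalizing K is an assumption made for this sketch.
    """
    k = normalize([o - c for o, c in zip(r_o, r_c)])
    along = dot(k, r)
    return [ri - 2.0 * along * ki for ri, ki in zip(r, k)]


# Hypothetical recording geometry for a reflection element.
r_c = [0.0, 0.0, 1.0]
r_o = normalize([0.3, 0.0, -1.0])

# Playing back with the reference ray should return the object ray.
image_ray = reconstruct_ray(r_c, r_o, r_c)
```

Tracing many such rays from a displaced source point and recording where they cross the image plane gives a spot diagram of the kind discussed in the text.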

The spot diagram generated by ray tracing should be adjusted for the variation
in efficiency at different locations in the aperture of the volume HOE.
However, it has been shown that a close relationship exists between the
observed image field and the density of rays traced through the element.12
Figure 8 shows the spot diagram of rays from a source point displaced 0.5 cm
perpendicular to the axis and 0.1 cm along the axis from the source formation
positions for f/1 and f/3.5 elements. It is clear that off-axis imaging
degrades much more rapidly for smaller f/No. HOEs.

A computer program coding the grating equation can be used to generate a spot
diagram at any desired image plane. When used in conjunction with Kogelnik's
efficiency model, both the aberrations and the efficiency of the rays forming
the image can be determined. This gives a better indication of the distribution
of flux at the receiver location and the detector current produced from a
source of given size, output power, and location relative to the HOE. Such a
program is currently under development in our lab for use with multiple image
reflection hologram design.

Fig. 7. Measured diffraction efficiency curves for gratings with K
approximating those formed by the meridional rays in (a) an f/1
system, i.e., 25°, and (b) an f/3.5 system, i.e., 10°. Significant
efficiency exists over an angular range of 30-60°.

Fig. 8. Spot diagrams for f/1 and f/3.5 systems with a reconstruction
source point 0.5 cm from the axis of the element at x = 0, y = 0.
Computations are based on the grating vector equation.


The effective HOE aperture and reflection losses also restrict the usable
source power. The solid angle subtended by the HOE relative to a source point is

$$\Omega = \frac{\pi D^{2} \cos\theta}{4 r^{2}},$$

where $D$ is the diameter of the hologram aperture, $\theta$ is the angle from the
source point to the center of the hologram, and $r$ is the distance from the
source point to the hologram center.

If the source is a Lambertian emitter, the flux collected by the aperture of
the HOE is $\Phi = (I_0 \cos\theta)\,\Omega$. When the source and optical element
are on-axis, 12.5% of the available source power is collected with an f/1
system and only ~1.0% for an f/3.5 system.

Fig. 9. Simplest holographic configuration for imaging interconnects.
Light from a point source is imaged to a diametrically opposite point.
Several sequential exposures can be encoded and used to produce an
invariant pattern of images. This can be used for invariant fan-out
configurations.

Fig. 10. Combined multifacet hologram and variable image mask.
A separate hologram facet is formed with each fan-out pattern encoded
on the mask. The mask and hologram are translated with respect to each
other. Each hologram is formed with a converging reference wave to allow
playback with an expanding beam.

If the hologram recording medium is not index matched to the source and
detector surfaces, Fresnel reflection losses also reduce the flux entering
the grating. The recording medium used has a refractive index of 1.64,
resulting in transmitted intensities ranging from 91 to 93% for incident
angles of 0-30°. Therefore 7-9% of the available source power is lost by
reflection. If a fixed amount of flux $\Phi_{det}$ is required at the
detector, axially located sources must have output powers $\Phi_s$
exceeding this value by $\Phi_s = \mu\Phi_{det}$, with
$\mu = 1/(0.125 \times 0.92) \approx 9$ for an f/1 system, and
$\mu = 1/(0.01 \times 0.92) \approx 109$ for an f/3.5 system. Therefore
considerable power is required from a Lambertian source even when a 100%
efficient hologram is used.
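
The power budget above can be checked numerically. The sketch below rests on two assumptions that are inferred from the quoted percentages rather than stated in the text: the collected fraction is the aperture solid angle divided by the 2π sr of a hemisphere, with the source one focal length from the element, and the mean Fresnel transmission is taken as 0.92 (the middle of the 91-93% range). The function names are illustrative only.

```python
import math

def collected_fraction(f_number):
    """Fraction of source power collected by an on-axis aperture.

    Assumption: solid angle Omega = pi*D^2/(4*r^2) with r = f, so D/r = 1/F,
    normalised by the 2*pi steradians of a hemisphere.
    """
    omega = math.pi / (4.0 * f_number ** 2)
    return omega / (2.0 * math.pi)

def source_power_multiplier(f_number, transmission=0.92):
    """Factor mu by which the source power must exceed the detector flux."""
    return 1.0 / (collected_fraction(f_number) * transmission)

for f_number in (1.0, 3.5):
    fraction = collected_fraction(f_number)
    mu = source_power_multiplier(f_number)
    print(f"f/{f_number}: collected {fraction:.4f}, mu = {mu:.1f}")
```

Under these assumptions the f/1 case reproduces the 12.5% figure and a multiplier of about 9. For f/3.5 the unrounded fraction is about 1.02%, so the multiplier comes out slightly below the value obtained from the rounded 1.0%.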

The divergence angle from a laser diode is approximately matched to the
meridional angle of an f/1 element (~30° for the laser and 26° for the
optical element). This implies that all the power from a laser diode can
be collected by a smaller aperture than for a Lambertian source. A laser
diode can therefore have much lower input power and still produce the
required detector current and perform the switching task.
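
The 26° figure for the optical element is consistent with simple geometry. As a rough check, assuming the aperture of diameter D is viewed from a point one focal length away (so that tan θ = 1/(2F), an assumption not spelled out in the text), the meridional half-angle can be computed as follows; the function name is hypothetical.

```python
import math

def meridional_half_angle_deg(f_number):
    """Half-angle subtended by the aperture edge, assuming the source sits
    one focal length from the element: tan(theta) = (D/2)/f = 1/(2F)."""
    return math.degrees(math.atan(1.0 / (2.0 * f_number)))

print(round(meridional_half_angle_deg(1.0), 1))
```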

After considering the above HOE and semiconductor source characteristics,
three types of hologram configuration appear to offer a solution. The
first arrangement is a large aperture reflecting lens with one or multiple
gratings (Fig. 9). This element is relatively easy to fabricate and
position and uses a point source for reconstruction. Multiple grating
formation allows a single reconstruction source to address several
locations simultaneously. It does however restrict the locations of
sources and detectors to positions along diameters which pass through the
optical axis, and fan out can only be accomplished in an invariant
pattern. This restriction may preclude this arrangement from practical
application but it is important for optical system evaluation. The second
and third configurations utilize the multifacet or aperture partitioning
concept recently discussed by Haugen et al.13 for transmission holograms
and require directed beam reconstruction either from a laser diode or a
directed LED emission pattern. In one of these arrangements a mask with
the address pattern serves as the object wave and a converging beam as the
reference wave (Fig. 10). This method has the attractive aspect of having
an IC-compatible technique (i.e., mask making) used for generating an
address pattern. The drawbacks of this arrangement are the intermodulation
terms which limit the efficiency of the reconstruction images.11 It is not
obvious where this becomes restrictive for this application. In the last
hologram configuration proposed each facet is illuminated sequentially
with a number of diverging object beams and a converging reference wave
(Fig. 11). The positions of the object beams can be moved automatically
with a computer-controlled stepper motor drive and beam ratios can be
adjusted for maximum diffraction efficiency. This configuration appears to
offer the most flexible arrangement for fabricating an interconnect
pattern since it satisfies requirements for both a large number of
independent channels and spatially variant fan out. The difficulty with
this HOE fabrication technique is the mechanical complexity of the mount;
however there appears to be no fundamental restriction to its
implementation.

Fig. 11. Multifacet hologram formed with selective object source points.
Source points are encoded in sequential fashion. This is the most flexible
configuration but also the most difficult to implement.

V.  Experimental  Results 

To evaluate some of the above ideas a number of experimental systems were
fabricated and tested. Only the first hologram design described above is
discussed here. The other two hologram types will be presented in future
papers.

Table I. Source and Detector Characteristics

Sources                  Litronix            Hewlett-Packard (HP)
Input power              155 mW              140 mW
Output power             70 µW/sr            80 µW/sr
Intensity profile        Lambertian          Lambertian
Size                     250 µm²             150 µm²
λ_peak                   660 nm              605 nm
Δλ                       20 nm               20 nm

Detectors                HP EO coupler PD    HP4205 PIN
Size                     400 µm²             200 µm (diameter)
Responsivity (630 nm)    0.1 A/W             0.4 A/W


The effects of image degradation and power loss were determined by
mounting a number of sources and detectors at increasing separations and
measuring the received detector photocurrent. The operating
characteristics of the sources and detectors are given in Table I. The
sources are surface emitting GaP LEDs. The primary reason for using these
devices is their peak emission in the visible (635 and 655 nm), making
them compatible with a number of available holographic recording
materials. They have about a 20-nm spectral bandwidth and a near
Lambertian intensity emission profile. Their main disadvantage is their
poor electrical to optical conversion efficiency. Measured efficiency of
both the 635- and 655-nm LEDs is ~0.5%. Sources and detectors used were in
chip form with cross-sectional dimensions of the same order of magnitude
as the size of the bonding pads (see Fig. 12).

Two source-detector mounts were used. On the first, the devices were set
on the common conducting plane of a dual in-line IC package. This
arrangement allowed evaluation of the effects of both electrical coupling
and direct optical scattering on the detector signal received from the
source image. The second mount had source and detector on different
substrates and was optically isolated to allow examination of the effects
of image degradation and aperturing at large source-detector separations.

Figure 13 is a plot of the ratio of photodiode current with the image of
the source focused onto the detector to the current with the image focused
just off the detector. Response with source-detector separations from
86 µm to 4 mm was obtained with the source and detector mounted on the
same conducting substrate. It appears that optical scattering and
electrical coupling greatly reduce the effective signal response at
separations <100 µm. At separations from 2 to 4 mm, contrast ratios
increase more slowly than at closer separations. With source and detectors
on separate substrates and isolation from optical scatter, the contrast
ratio improves by an order of magnitude at 1.0-cm distances, then falls by
a factor of 2 as separation increases to 2 cm. The falloff at larger
separations results from aberrations which reduce image irradiance.

The image of the source was also observed on a CCD line scanner to
directly evaluate the image irradiance pattern. Figure 14 shows these
profiles when the 635-nm LED is 0.45, 0.60, 1.00, and 1.50 cm from the
linear scanner. The hologram used for these measurements has a diameter of
1.5 cm and is 3.10 cm from the source-detector plane. The CCD scans
indicate that the falloff in effective signal response at large
separations results from an increase in the image area and a corresponding
decrease in image irradiance illuminating the detector.

Fig. 12. Litronix LED with 250-µm² emission area and a Hewlett-Packard
photodiode from an electrooptic coupler with 400-µm² area. The separation
of the two chips is ~[illegible] µm.

Fig. 13. Plot of the ratio of photodiode current with image focused on the
detector to the current with the image focused off the detector. The
equipment used did not allow measurements with source-detector separations
from 4 to 10 mm; (×) indicates measurements obtained with sources and
detectors on the same substrate; (O) on separate substrates.

Fig. 14. CCD line scan traces of images of the 635-nm LED produced with
the HOE. The CCD has [illegible] elements. Oscilloscope scale is
[illegible] µm per 1 cm. Source-CCD separations are (a) 0.45 cm;
(b) 0.60 cm; (c) 1.00 cm; and (d) 1.50 cm.

Fig. 15. (a) Schematic of hologram construction arrangement to form a
multiple image with a single reconstruction source. The film plane is
translated through fixed construction beams. (b) The resulting element is
in effect a set of reflecting lenses with displaced optical axes which
image the source relative to their respective axes. The lenses in (b) are
shown unfolded for clarity.

A number of multiple exposure holograms were made to examine the potential
of optical fan out. Elements were made with the arrangement shown in
Fig. 15. Converging and diverging wave fronts overlap to form an on-axis
reflecting lens type hologram. The film plane is then translated in this
overlap region to form a number of holographic lenses with their optical
axes displaced by the amount of translation. A single reconstruction
source has a different displacement from the optical axis of each encoded
element and therefore images the source at a different position in space.
Figures 16(a) and (b) show images produced from two such elements. In the
first, film translations of 0.7 by 0.25 cm were used, while in the second
0.5-mm movements were made. Both situations give well-resolved images with
full width at half-intensity maxima (FWHM) of ~300 µm. The LED emission
surface is 150 µm in length.
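
The way displaced lens axes spread one source into several images can be illustrated with a toy geometric model. This is only a sketch under an idealised assumption, namely that each encoded lens images the source at unit magnification to the diametrically opposite point about its own axis (the geometry of Fig. 9), with aberrations ignored. The function name and the coordinates are hypothetical.

```python
def image_positions(source_xy, lens_axes_xy):
    """Image of one source formed by each displaced unit-magnification lens.

    Assumption: each lens maps a point to the diametrically opposite point
    about its own optical axis, so image = 2 * axis - source.
    """
    sx, sy = source_xy
    return [(2.0 * ax - sx, 2.0 * ay - sy) for ax, ay in lens_axes_xy]

# Hypothetical layout in mm: three lens axes 0.5 mm apart,
# with the source 5 mm to one side of the first axis.
axes = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
images = image_positions((-5.0, 0.0), axes)
print(images)
```

In this idealised picture a film translation of d moves the corresponding image by 2d, so closely spaced exposures yield a row of distinct image points from a single source.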

VI.  Conclusions  and  Future  Research 

Reactive power considerations indicate that with current electrooptic
technology an optical chip-to-chip interconnect requires approximately the
same amount of power to transmit high speed signals as electrical
connections but without the need to devote large sections of the circuit
substrate to communication channels. This would allow the use of more
input/output ports and increase the information capacity of the IC.


Fig. 16. (a) Photograph of multiple images formed with an element having
0.25-cm horizontal and 0.70-cm vertical displacements using an LED
reconstruction source. The diode is 1 cm from the center of the image
pattern. (b) Photograph of a CCD line trace of the LED imaged by a HOE
with three 500-µm translations. Scale is 330 µm per 1 cm.


It could also reduce electrical coupling difficulties of conventional
interconnect schemes. The chip-to-chip interconnect can be recast in terms
of an optical imaging system with semiconductor sources as signal
transmitters and photodiode detectors as receivers.

The diffraction efficiency characteristics of reflection volume holograms
have sufficient angular response to accommodate source-detector
separations of a few centimeters. These separations also require that the
holographic element be located a comparable distance above the circuit
substrate. Other practical considerations are Fresnel reflection losses
and flux collection characteristics of a particular f/No. element and
source emission profile. Serious limitations also exist in the lack of
compatibility between efficient semiconductor sources and holographic
recording materials. A match between these components would allow use of
much more efficient sources and greatly improved flux collection
geometries.

Initial experiments indicate that electrical and optical coupling are
serious problems when sources and detectors are <100 µm apart and that
image blurring causes the falloff in detector irradiance at separations of
a few centimeters and greater.

Experiments also indicate that sequentially exposed holograms have
sufficient resolution to address a number of receivers spaced from several
hundred micrometers to centimeters. This could be used to implement a
number of very flexible interconnect patterns without the drawbacks of
conventional electrical systems.

This work was supported by the Air Force Office of Scientific Research.
One of us (RKK) would especially like to thank IBM for fellowship support
during this period.

Abstract. Optical beams are known to have many desirable properties when
used for providing interconnections. Such interconnections would be used
in an all-optical computer based on optical gates, but can be used at
various levels of architecture in electronic computing systems. The
fan-out of optical interconnections from one computing element to N
computing elements is accompanied by an N-fold loss of light power for
each connection. Less obvious is the fact that fan-in of connections from
N computing elements to a single computing element can in some cases also
be accompanied by an N-fold loss of power.


1.  Introduction 

Much attention is now being given to the possible use of optics as a means
for providing interconnections in computing structures of various kinds at
various levels of architecture [1-3]. The main attraction of optics in
this regard is the freedom from interference between adjacent channels of
interconnections, arising fundamentally from the fact that most
propagation media are linear at the light levels that would be used for
such signals. Interconnection paths formed by flows of electrons have a
strong tendency to interact, due to the fact that such flows are composed
of moving charges.

Optical interconnections can obviously be utilized in an all-optical
computer, for which the basic logic operations are performed by optical
logic elements, perhaps based on optical bistability. However, they can
also play a more immediate role in hybrid opto-electronic computers, in
which the tendency of electrons to interact is exploited to produce
nonlinear interactions of signals in electronic logic gates, while optics
is used to provide interconnections at some levels of architecture.
Applications of optics for interconnections at high levels of architecture
(machine-to-machine or processor-to-processor) are currently most easy to
realize, while optical interconnections at the lowest levels of
architecture (e.g. gate-to-gate connections) are most difficult to
realize.

In this paper we examine some fundamental properties of optical
interconnections related to their fan-in and fan-out properties. The term
fan-out refers to the splitting of a single node or interconnection into
several interconnections, each carrying the same signal. The term fan-in
refers to the coming together of several interconnections into a single
interconnection or node, all of the component signals being added to form
a single signal. The two cases are illustrated in figure 1. We will show
that optical and electronic interconnections share some properties but
also differ in some fundamental ways. In particular, we shall see that
electronic and optical interconnections are quite similar with respect to
their fan-out properties, but can differ markedly in their fan-in
properties.

Figure 1. Representation of (a) fan-out and (b) fan-in.


2.  Fan-out 

It is very common in the construction of complex logic circuits that the
output of a single gate must be sent to the inputs of several gates that
follow. A simple example is shown in figure 2 [1] in which the output of
one inverter drives the inputs of several inverters in parallel. In order
to activate the parallel set of inverters, it is necessary that the
current supplied by the first inverter charge the input capacitances of
the following inverters to the point where the voltages across those
capacitances all exceed the logic threshold voltage. Other examples can be
found at higher levels of computer architecture. For example, in the
construction of a crossbar switch for interconnecting several processors
and memory modules (see figure 3), fan-out must be present if the switch
is to offer broadcast capability, i.e. the capability of a single module
to broadcast a common message to several other modules simultaneously.
Again a single output must charge the inputs of a parallel array of
capacitances.

Figure 2. Fan-out of connections from one inverter to other inverters.

It is tempting to believe that optical interconnections offer a distinct
advantage vis-à-vis electrical interconnections when substantial fan-out
is present. However, as we now argue, this is generally not the case. An
optical interconnection (figure 4(a)) is established by driving an optical
source (a laser diode or an LED) with an electrical current. The optical
source converts the flow of electrons into a flow of photons, subject to
certain limitations on the efficiency of that conversion. A portion of
this flow of photons is incident on a photodetector at the far end of the
interconnection. The photodetector converts the flow of photons into a
flow of electrons, again subject to certain limits on conversion
efficiency. Finally, the flow of electrons must charge the input
capacitance of a gate to its logic threshold voltage. Just as the flow of
electrons must be divided N ways if an electronic interconnection with
N-fold fan-out is to be established, so too the flow of photons must be
divided N ways if an optical interconnection with N-fold fan-out is
desired (figure 4(b)). In both the electrical and the optical cases,
fan-out by a factor N will result in an N-fold increase in the time
required to charge the N capacitors at the ends of the N interconnections,
unless the rates of electron and photon flows are increased by a factor N
to compensate. Thus an optical interconnection in effect suffers from the
same capacitive-loading effects that an electronic interconnection
experiences, contrary to what might have been expected at the start.
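
The capacitive-loading argument can be written out explicitly. The following is a sketch under simplifying assumptions that are not in the original text: each gate presents the same input capacitance $C$, the logic threshold voltage is $V_{th}$, the driver supplies a constant total current $I$ (or the optical source a constant photon rate $R_p$), and each photodetector converts photons to electrons with quantum efficiency $\eta$. All of these symbols are introduced here for illustration.

```latex
% Electrical fan-out: the drive current I is shared by N identical gate inputs.
t_N^{\mathrm{elec}} = \frac{C\,V_{th}}{I/N} = N\,\frac{C\,V_{th}}{I}

% Optical fan-out: the photon rate R_p is shared by N detectors, each
% delivering a photocurrent \eta q R_p / N (q is the electron charge).
t_N^{\mathrm{opt}} = \frac{C\,V_{th}}{\eta\,q\,R_p/N} = N\,\frac{C\,V_{th}}{\eta\,q\,R_p}
```

In both expressions the charging time grows linearly with $N$ unless $I$ or $R_p$ is raised by the same factor, which is the equivalence the paragraph describes.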

There  is  one  respect  (in  addition  to  the  immunity  of  optical  interconnections  to 
interference  mentioned  earlier)  in  which  optical  interconnections  do  offer  a  potential 
advantage.  If  the  length  of  a  metallized  electronic  interconnection  is  substantial,  then 
the  capacitance  of  the  interconnection  itself  may  become  comparable  to  or  even 
greater  than  the  capacitance  of  the  gate  at  the  far  end.  The  increased  capacitance  will 
result  in  slower  charging  times  and  lower  transmission  speeds  for  the  interconnec¬ 
tion.  An  efficient  optical  interconnection  does  not  possess  any  characteristics  similar 
to  the  capacitance  of  the  interconnection  line  itself.  Therefore  when  long  inter¬ 
connections  are  required,  optics  may  have  a  distinct  advantage.  However,  a  recent 
examination  of  the  chip-to-chip  interconnection  problem  [2],  for  which  inter¬ 
connection  lengths  of  only  a  few  centimetres  were  assumed,  showed  that  the 
capacitances  of  the  metallic  interconnection  lines  were  small  compared  with  the 
capacitances  of  the  bonding  pads,  indicating  that  this  potential  advantage  of  optics 
may  not  be  important  for  short-distance  communication  between  chips. 

One important difference between optical and electronic interconnections
becomes evident when further optical consequences of fan-out are fully
considered. Such consideration requires the use of the principle of
conservation of generalized étendue [3, 4], often referred to as the
constant radiance theorem [5]. According to this theorem, the product of
the cross-sectional area and the square of the numerical aperture of an
optical beam must remain constant under any lossless linear transformation
of that beam. Thus the fan-out of a single optical beam of cross-sectional
area A into N beams, each of cross-sectional area A, must be accompanied
by a reduction of the numerical apertures of the new beams by a factor
$\sqrt{N}$. Such will be the case whether the optical interconnections
propagate in free space or in multimode waveguides and fibres. This
theorem, which is derived using the principles of geometrical optics, does
not hold in the case of single-mode guides, for which geometrical optics
is not valid. The fact that fan-out of optical beams changes the beam
divergence has no obvious analogue in the case of electronic
interconnections. The implications of the constant radiance theorem in the
case of fan-out, while important, are overshadowed by those for the case
of fan-in, to which we now turn.
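
The reduction in numerical aperture follows directly from the theorem as stated. Writing $\mathrm{NA}$ for the numerical aperture of the original beam and $\mathrm{NA}'$ for that of each fanned-out beam (notation introduced here for illustration), and assuming a lossless split into $N$ beams of unchanged area $A$:

```latex
A\,\mathrm{NA}^{2} = \underbrace{N A}_{\text{total area}}\;\mathrm{NA}'^{\,2}
\quad\Longrightarrow\quad
\mathrm{NA}' = \frac{\mathrm{NA}}{\sqrt{N}}
```

For example, a fan-out of 16 would reduce the numerical aperture of each beam by a factor of 4.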


3.  Fan-in 

Just as fan-out of multiple connections from a single logic gate is
common, so too fan-in of multiple connections to a single logic gate is
often required. Fan-in is also required at higher levels of architecture.
For example, some forms of crossbar switch are constructed in such a way
that all input lines can simultaneously address a single output line.
Therefore, it is important to consider the consequences of fan-in for both
electronic and optical interconnections.

Figure 5 illustrates a generic kind of fan-in connection. When the
connections are electrical, all sources of current must be capable of
charging the input capacitance of the final node to the logic threshold
voltage. However, if the output impedances of the devices driving the
lines are finite, a portion of the current generated by one source will
flow back through all other lines, causing the rate of charging of the
desired capacitance to be slower than would be the case with no fan-in.
The degree to which the speed of the circuit is limited depends on the
output impedances of the sources and on the number of such lines being
fanned-in to a common point.

It might appear at first glance that optical interconnections do not
suffer from fan-in limitations of the above kind. Indeed in some cases
they do not, but in other cases there is a very important limitation
associated with optical fan-in, which, while different in origin than the
effect encountered with electrical interconnections, none the less has
similar or even worse consequences. The optical effect can again be viewed
as a consequence of the constant radiance theorem, and its seriousness
depends on the relationship between the cross-sectional areas and the
numerical apertures of the beams that are being fanned in, and the same
parameters of the resultant beam after fan-in. If the fan-in of N
identical optical beams is onto a detector with N times the
cross-sectional area of the individual beams, or with an acceptance
numerical aperture that is $\sqrt{N}$ times the numerical aperture of one
of the incident beams, then there need be no penalties associated with
fan-in (aside from the fact that a large optical detector generally has a
high capacitance and a correspondingly slow speed). On the other hand, if
the fan-in requires that N identical and mutually incoherent beams be
combined to form a single beam with the same cross-sectional area and the
same numerical aperture as those of any one of the incident beams, then
the constant-radiance theorem implies that the optical power delivered
into the resultant beam cannot exceed 1/Nth of the total incident optical
power carried by all the interconnections.
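
The two regimes described above can be captured in a short calculation. The sketch below is an illustrative reading of the constant-radiance bound, not a formula quoted from the paper: it assumes N identical, mutually incoherent input beams and compares the étendue (area times numerical aperture squared) accepted at the output with the total étendue of the inputs. The function and parameter names are hypothetical.

```python
def max_fan_in_fraction(n_beams, area_ratio=1.0, na_ratio=1.0):
    """Upper bound on the fraction of total incident power that can be
    delivered after fan-in of n_beams identical incoherent beams.

    area_ratio: output (detector) area divided by the area of one input beam.
    na_ratio:   output numerical aperture divided by that of one input beam.
    Assumption: power transfer is limited by the ratio of output etendue
    to total input etendue, capped at 1 (lossless).
    """
    output_etendue = area_ratio * na_ratio ** 2
    input_etendue = float(n_beams)  # n_beams beams of unit area and unit NA
    return min(1.0, output_etendue / input_etendue)

print(max_fan_in_fraction(4))                  # same area, same NA
print(max_fan_in_fraction(4, area_ratio=4.0))  # detector with N times the area
print(max_fan_in_fraction(4, na_ratio=2.0))    # acceptance NA larger by sqrt(N)
```

With four beams, the first case is limited to one quarter of the total incident power, while the other two cases incur no étendue penalty, matching the two situations distinguished in the text.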

The above conclusion can have profound effects on the design of optical
interconnections. For example, one possible way to attempt an N-fold
fan-in of optical beams is by means of a holographic optical element used
as a beam combiner. The hologram is recorded by sequentially or
simultaneously recording the interference patterns between a single
reference plane wave and N object plane waves travelling at different
angles with respect to the reference wave (figure 6(a)). When N beams are
incident on the resulting hologram at angles duplicating those of the
original object beams, there will be generated a beam propagating in the
direction of the original reference wave, carrying contributions from all
of the incident beams (figure 6(b)). It has been assumed for simplicity
that the wavelength of the light exposing the hologram is identical to
that of the light incident during the beam-combining operation. The
cross-sectional area and the divergence angle of the combined beam should
be identical with those of the beams incident on the combiner. However,
the constant-radiance theorem implies that, on the average, the new beam
can contain no more than 1/Nth of the power from each of the incident
beams, the average being over all possible relative phases of the incident
beams. The light not carried by the combined beam can be shown to appear
in other orders of transmitted light.

An intuitive argument confirming the prediction of the constant radiance
theorem can be reached by considering the same holographic element
illuminated by a backwards travelling version of the original reference
wave. The hologram can send at most 1/Nth of the incident light into each
of the back-propagating versions of the object waves. Thus any single
grating in the hologram can be at best 100/N per cent efficient, and in
general will be even less efficient.

It has been implicitly assumed in the above arguments that the beams to be
combined have random phases with respect to one another. Such will be the
case if the beams to be combined originate from different optical sources.
It will also be the case when all beams originate from the same source
unless the entire optical interconnection system is stabilized to maintain
absolutely constant paths in all arms. Such a stabilization seems unlikely
in practice. For any one realization of relative phases between beams, the
amount of optical power transferred to the resultant beam would be
influenced by interference between the various contributing beams, and
could be greater than or less than the power predicted by the
constant-radiance theorem.

Similar conclusions also apply if the beams are travelling in single-mode
waveguides. It is known [6] that the amount of power coupled into a single
monomode waveguide from a Y-junction of two identical monomode waveguides
(figure 7) carrying identical optical powers may be as great as twice the
power carried by one of the input guides, or may be as small as zero,
depending on the relative phases of the two incident beams. If the phase
difference between the two beams varies randomly and uniformly over
$2\pi$ radians, then the predictions of the constant-radiance theorem are
obtained, namely on the average, one half of the incident power will be
transferred to the outgoing guide.
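
The phase-averaging statement can be checked with a simple numerical model. The sketch assumes an idealised Y-junction in which only the symmetric combination of the two input fields is guided, so that two inputs of equal power P with relative phase φ give an output power P(1 + cos φ). This model and the function names are assumptions made for illustration, not taken from reference [6].

```python
import math

def y_junction_output(phase, input_power=1.0):
    """Output power for two equal inputs with the given relative phase.

    Assumed idealised model: P_out = P * (1 + cos(phase)), which ranges
    from 0 (out of phase) to 2P (in phase).
    """
    return input_power * (1.0 + math.cos(phase))

def phase_averaged_output(samples=3600, input_power=1.0):
    """Average output power over a uniformly distributed relative phase."""
    total = sum(
        y_junction_output(2.0 * math.pi * k / samples, input_power)
        for k in range(samples)
    )
    return total / samples

print(y_junction_output(0.0), y_junction_output(math.pi), phase_averaged_output())
```

With unit power in each input guide, the total incident power is 2, and the phase-averaged output in this model comes out at essentially 1, i.e. one half of the incident power.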


4.  Concluding  remarks 

The assumption that optical interconnections are superior to electronic
interconnections from the viewpoint of fan-out and fan-in is in general
unwarranted. The fan-out properties of optical beams are essentially the
same as those of electrical connections. The fan-in properties of optical
beams are somewhat more complex than those of electronic interconnections.
If N identical incoherent beams are to fan-in to a single beam with the
same cross-sectional area and the same angular divergence as the input
beams, then there must be a significant and fundamental loss of power
associated with the fan-in operation. If the N beams are mutually
coherent, then the amount of power transferred to the resultant beam
depends on the relative phases of the component beams, but averaged over
all possible relative phases, results identical to those of the incoherent
case will be obtained.

There are still good reasons to be interested in optics for
interconnections, principally the relative immunity of optical beams to
mutual interference effects. A second important reason for interest rests
on the potential for constructing dynamic optical interconnection
networks, which would allow rapid reconfiguration of interconnections and
thereby offer a new degree of freedom for computer design.


Figure 7. A monomode waveguide Y-junction.

The first known experimental results of real time optical defect
enhancement of a periodic mask are reported. A low-intensity reference
wave interferes with the Fourier transform of an object beam to form a
hologram in a photorefractive crystal. The nonlinear properties of the
crystal perform a filtering operation, and phase-conjugate readout results
in a defect-enhanced image. Defects of size 10 µm × 100 µm have been
easily detected with high signal-to-noise ratio, and a discussion of
performance limitations is presented.


We consider the problem of selectively enhancing defects in a mask that
consists of mostly periodic structure. This type of problem in image
processing occurs,
for  example,  in  the  inspection  of  integrated-circuit 
masks.  Digital  techniques  for  inspection  of  a  two- 
dimensional  field,  generally  utilizing  a  dual-scanning 
microscope  system  and  sophisticated  algorithms  for 
comparison  and  detection,  are  complicated  and  time 
consuming.1  Optical  systems,  however,  offer  the 
advantage  of  parallel  processing.  Furthermore,  there 
is  no  excessive  requirement  for  accuracy  in  the  output 
in  terms  of  the  actual  intensity  at  each  point.  It  is 
sufficient  that  the  signal  associated  with  the  defect  be 
much  larger  than  the  signal  associated  with  the  sur¬ 
rounding  periodic  structure,  so  that,  for  example,  a 
thresholding  operation  can  be  used  to  determine  the 
defect  location. 

Optical  spatial-filtering  techniques  to  perform  defect 
enhancement have been examined in the past with regard to such applications
as inspection of the electron-beam collimating grid and the
silicon-diode-array target
for  a  television  camera  tube  as  well  as  for  inspection  of 
photomasks  used  in  the  manufacture  of  integrated 
circuits.7  11  These  systems  used  a  filter  in  the  Fourier 
plane  to  attenuate  the  discrete  spatial  frequencies  of  the 
periodic portion of the mask, so that, on retransformation, only defects were
present in the output. Although the results of such systems were promising,
the usefulness of the technique was limited by the fabrication time or
difficulty of the filter and by the need to use high-quality, low-f-number
lenses when inspecting objects of large dimensions. Recently the second
constraint was removed by employing holographic recording of the output
combined with phase-conjugate readout.10 Although this method has been used to
detect submicrometer defects, it requires two processing steps: for
each  mask  to  be  inspected,  a  new  hologram  must  be 
recorded,  and  for  each  different  type  of  mask,  a  new 
photographic  filter  must  be  made. 

We present a method to enhance defects in real time using a photorefractive
crystal. Use of the crystal allows holographic recording, filtering, and
phase-conjugate readout processes to be performed simultaneously. The mask to
be inspected is placed in the
input  plane,  and  the  defect-enhanced  image  appears  at 
the  output  plane,  in  a  time  limited  only  by  the  time 
constant  of  the  photorefractive  material.  This  time 
constant,  which  depends  on  the  material  used  and  the 
incident  light  intensity,  ranged  from  about  50  to  about 
250  msec  for  our  experimental  parameters.  This 
method  also  differs  from  that  described  above  in  that 
all  operations  are  carried  out  in  the  Fourier  domain.  To 
our  knowledge,  this  work  is  the  first  demonstration  of 
a  real-time  system  for  enhancing  defects  in  a  periodic 
mask. 

The  technique  for  performing  real-time  defect  en¬ 
hancement  is  based  on  two  observations.  The  first  is 
that  the  Fourier  transform  of  a  periodic  object  is  an 
array  of  discrete  spikes  whose  width  depends  inversely 
on  the  input  field  size  and  whose  spacing  depends  in¬ 
versely  on  the  period  of  the  mask.  In  contrast,  the 
Fourier  transform  of  a  small  defect  is  a  continuous 
function  that  is  several  orders  of  magnitude  less  intense 
than  the  periodic  spikes.  The  second  observation  is 
that  the  diffraction  efficiency  of  a  volume  phase  holo¬ 
gram  formed  in  a  photorefractive  medium  is  maximized 
when  the  intensities  of  the  two  writing  beams  are  ap¬ 
proximately  equal  and  decreases  as  the  difference  in 
intensity increases. For a reference plane-wave intensity (I_r) more intense
than the object-beam intensity (I_o), the output is proportional to the
object-beam intensity; for an object beam more intense than the reference
beam, the output is proportional to the inverse of the object-beam intensity.
A typical diffraction-efficiency versus beam-ratio curve is plotted in Fig. 1
on a log-log scale, assuming that the beam ratio R (R = I_o/I_r) is varied by
changing I_o while keeping I_r fixed. This curve was generated by using the
standard Kogelnik expression for diffraction efficiency:

$$\eta = \exp\!\left(-\frac{\alpha d}{\cos\theta}\right)\sin^{2}\!\left(\frac{\pi\,\Delta n\,d}{\lambda\cos\theta}\right)$$

and substituting in parameters appropriate for a Bi12SiO20 (BSO) crystal.

Fig. 1. Diffraction efficiency versus beam ratio. Shown is
a typical curve for photorefractive BSO or BGO.


portion  of  the  curve,  are  given  in  Ref.  11.)  Therefore, 
a defect can be enhanced by focusing the Fourier
transform  of  the  mask  onto  the  photorefractive  crystal 
and  making  the  intensity  of  the  peak  spectral  compo¬ 
nent  that  is  due  to  the  defect  less  than  or  equal  to  the 
intensity  of  the  reference  beam.  The  intensity  of  the 
spikes  that  is  due  to  the  periodic  structure  will  be  so 
much  greater  than  the  reference-beam  intensity  that 
the  corresponding  diffraction  efficiency  will  be  very 
small.  Thus  the  refractive-index  pattern  formed  inside 
the  crystal  performs  both  recording  and  filtering  oper¬ 
ations. 
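The recording-and-filtering argument above can be illustrated with a one-dimensional numerical sketch. It rests on assumptions that are not from the paper: the grating strength is taken to follow the fringe modulation m = 2 sqrt(I_o I_r)/(I_o + I_r), which is proportional to the object amplitude for I_o much less than I_r and to its inverse for I_o much greater than I_r; the reference intensity is set equal to the peak spectral intensity of the defect alone; and the mask dimensions are arbitrary. The mirror inversion of phase-conjugate readout is ignored.

```python
import numpy as np

def enhance_defects(mask, defect):
    """Model the photorefractive filter on a 1-D mask; return output intensity.

    Assumptions (not from the source): grating strength follows the fringe
    modulation m = 2*sqrt(Io*Ir)/(Io + Ir), and the reference intensity Ir is
    set equal to the peak spectral intensity of the defect alone.
    """
    spectrum = np.fft.fft(mask)                      # object field at the crystal
    i_obj = np.abs(spectrum) ** 2                    # object-beam intensity Io(u)
    i_ref = np.max(np.abs(np.fft.fft(defect)) ** 2)  # reference intensity Ir
    # m * exp(i*phase) equals 2*sqrt(Ir) * spectrum / (Io + Ir)
    filtered = 2.0 * np.sqrt(i_ref) * spectrum / (i_obj + i_ref)
    return np.abs(np.fft.ifft(filtered)) ** 2

n, period, opening = 4096, 32, 16
x = np.arange(n)
periodic = ((x % period) < opening).astype(float)   # periodic transmitting bars
defect = np.zeros(n)
defect[2006:2010] = 1.0                             # small transmitting defect in a gap
output = enhance_defects(periodic + defect, defect)

if __name__ == "__main__":
    print("brightest output pixel:", int(np.argmax(output)))
    print("peak / median background:", output.max() / np.median(output))
```

In this toy model the strong spectral spikes of the periodic structure are attenuated by roughly the inverse of their own intensity, while the weak, broad defect spectrum passes almost unchanged, so a simple threshold on the output would locate the defect.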

The technique of using a weak reference beam and a strong object beam to
perform optical processing is not new. Ragnarsson recorded filters in
photographic film with this technique in order to perform division.13 This
technique has also been used in photorefractives to obtain edge enhancement of
binary images, by both Huignard and Herriau in BSO14 and by Feinberg in
BaTiO3.15 However, to our knowledge, this is the first
use  of  the  technique  in  photorefractives  to  enhance  se¬ 
lected  features  in  an  object  beam  and  suppress 
others. 

A  Fourier-optics  analysis  can  be  used  to  describe  the 
propagation  of  light  from  the  object  to  the  crystal. 
Suppose  that  the  mask  has  dimensions  W  X  L  and  that 
a small transparent defect, located at (x0, y0), has dimensions w × l.
Let p(x, y) represent one unit cell of
the  periodic  structure,  which  is  spaced  at  intervals  of 
length  a.  The  intensity  of  the  Fourier  transform  at  the 
crystal, assuming W, L ≫ a and unit illumination, is

where the sinc function is as defined by Bracewell.16 The spatial-frequency
variables are related to the spatial variables as u = x/λf and v = y/λf, and
P(u, v) is the Fourier transform of p(x, y). P(0, 0) represents the


transmitting area of one period of the pattern, and P(0, 0)/a² is the fraction
of the mask area that is transmitting. At the crystal, the hologram should be
recorded such that the intensity of the periodic portion of |T(u, v)|² is
greater than the reference-beam intensity I_r, and the intensity of the defect
portion of |T(u, v)|² is less than I_r. Mathematically, if D is defined as the
relevant dynamic range of the periodic portion and I_i is the intensity
incident upon the mask, then the two conditions are

Fig. 2. Experimental setup. VA/BS, variable attenuator/beam splitter; BS, beam
splitter; CL, collimating lens; PCB, polarizing-cube beam splitter; FTL,
Fourier-transform lens.


Fig. 3. (a) Input mask and (b) output defect-enhanced image. The coordinates
of the seven defects, measured in units of numbers of squares and taking the
center of the lower left-hand square to be (0, 0), are listed by defect size
(μm) in the accompanying table.

Fig. 4. Intensity line scan of 10 μm × 100 μm defect. Graph illustrating the
signal-to-noise ratio obtained for the smallest defect.

If  the  defect  is  opaque  rather  than  transmitting,  then 
the  second  condition  should  be  modified: 


The experimental setup used to obtain defect enhancement is shown in Fig. 2.
An argon-ion laser (λ = 514.5 nm) was collimated and split to form the two
writing beams as well as the probe (readout) beam. A BSO crystal, of size
8 mm × 8 mm × 8 mm, was oriented with the x direction shown in Fig. 1 along a
[110] axis. An f/4.9 lens was used to perform the Fourier transform, and the
output was detected by a charge-coupled device (CCD) camera. The combination
of a half-wave plate, a polarizing-cube beam splitter (PCB), and a second
half-wave plate allowed the beam ratio to be changed while the polarizations
were kept the same. To improve the signal-to-noise ratio, a polarizer was
placed in front of the output to reduce the scattered light.11

The object mask consisted of a 36 × 36 array of squares, each with sides of
150 μm. The spacing between the squares was 100 μm, so the period a was equal
to 250 μm. The total mask size was 9 mm × 9 mm. Within this array were placed
seven transmitting defects of sizes 100 μm × 100 μm down to 100 μm × 10 μm, as
shown in Fig. 3(a). The output of the optical system, obtained using an
applied voltage of 4 kV, is shown in Fig. 3(b). The periodic background has
been quite effectively suppressed, leaving the defects clearly visible.
Figure 4 shows an intensity scan of one line of the output image, illustrating
the worst-case signal-to-noise ratio obtained. The defect represented is one
of the two 10 μm × 100 μm spots; thus the system appears easily capable of
detecting smaller defects.

In recording the hologram, the object-beam intensity at the mask (I_i) was
16 mW/cm², and the reference-beam intensity was 3.0 mW/cm², which led to beam
ratios at the crystal of 0.014 to 0.00014, depending on the size of the
defect. Thus the experimental results indicate that enhancement occurs even
for values of R much less than one. Because the inverse properties shown in
Fig. 1 were derived under conditions of plane-wave illumination, the filtering
properties of the crystal cannot be described by simply a beam-ratio
dependence. Further investigation into the actual behavior of the crystal is
currently being undertaken.

The resolution obtained in the output was constrained by two factors. The
primary constraint was the size of the crystal. Given the f-number of the
system, the crystal captured only the central fifth of the primary lobe of the
sinc function that was due to the smallest defect; therefore the output of the
system produced the defect convolved with a smoothing function. Thus reducing
the f-number of the optical system (and using a crystal of larger dimensions)
will greatly improve the resolution capability. The second constraint on the
resolution was the size of the imaging elements of the CCD camera, each of
which measured 23 μm × 13.4 μm.

In summary, a method to enhance defects in a periodic mask in real time has
been presented. A photorefractive crystal is used to perform holographic
recording, filtering, and readout processes simultaneously. Preliminary
experimental results show detection of defects down to 10 μm × 100 μm in size.
Detection of smaller defects should be possible by using an optical system
with a smaller f-number and a camera with smaller resolution elements.
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University  and  by  the  U.S.  Air  Force  Office  of  Scientific 
Research.  The  assistance  of  Mike  Smith  and  Zora 
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Neural  Networks  for  Computation: 

Number  Representations  and  Programming  Complexity 

* 
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Stanford,  CA  94305 

Abstract 

Methods  for  using  neural  networks  for  computation  are  considered.  The  success 
of  such  networks  in  finding  good  solutions  to  complex  problems  is  found  to  be  depen¬ 
dent  on  the  number  representation  schemes  used.  Redundant  schemes  are  found  to 
offer advantages in terms of convergence. Neural networks are applied to the
combinatorial optimization problem known as the "Hitchcock problem" and to
signal processing problems such as matrix inversion and Fourier
transformation. The concept
of  programming  complexity  is  introduced.  It  is  shown  that  for  some  computational 
problems,  the  programming  complexity  may  be  so  great  as  to  limit  the  utility  of  neural 
networks,  while  for  others  the  investment  of  computation  in  programming  the  network 
is  justified.  Simulations  of  neural  networks  using  a  digital  computer  are  presented. 
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I.  Introduction 

Even the fastest modern computer cannot compare with the brain of an infant in
the performance of intelligent information processing such as image processing
and pattern recognition. This often-quoted fact suggests the possibility of a
quite different type of computer. The fundamental difficulty in creating
artificial intelligence on conventional digital computers comes from the large
difference in architectures of information processing between digital
computers and human brains, i.e., the sequential processing in von Neumann
machines and the massively parallel computation in human brains.
Neuroscientists have revealed that the massive parallelism and the
computational richness in the human brain lie in the global and dense
interconnections among a large number of identical logic elements or neurons
which are connected to each other with variable strengths by a network of
synapses. An artificial neural network system that can perform parallel
computation and the function of natural intelligence is extremely attractive
as a future-generation computer.

However,  there  exist  two  major  problems  that  must  be  attacked  before  the  realiza¬ 
tion  of  such  a  neural  computer.  The  first  is  a  hardware  problem  of  how  to  implement 
those  global  and  dense  interconnections  among  many  neuron-like  logic  elements,  and 
the  second  is  a  software  problem  of  how  to  program  such  highly  parallel  computation 
on a neural network system. We may take two different approaches to the first
problem: VLSI-based interconnections and optical interconnections. Neurons in
the human brain are interconnected in three-dimensional space since it is the
most natural and efficient way of interconnection, but VLSI-based
interconnections are inherently two-dimensional in nature. Optical signals, on
the other hand, can flow through three-dimensional space to achieve the
required interconnects between neuron-like logic elements. Based on this idea,
several schemes of optical computing have been proposed. Among them, Psaltis
and Farhat recently reported an optical implementation of the Hopfield neural
network8,9 using an optical vector-matrix multiplier as a programmable
interconnector, and demonstrated the feasibility of optical content
addressable associative memory.

Extensive studies have been done on the basic characteristics of the neural
networks themselves, but the second problem of how to program them to do
various computations of practical interest has not been fully studied except
in their application to associative memory.12 Quite recently, Hopfield and
Tank13 showed that a certain class of optimization problems can be programmed
and solved on their neural network model. They demonstrated the computational
power and speed of their neural network by solving one of the NP-complete
problems known as the "Traveling-Salesman problem." The purpose of this paper
is to extend their idea and explore new possibilities of programming neural
networks to solve various other non-biological problems of practical interest.
We emphasize that our goal is not to propose mechanisms that might actually be
utilized by the brain, but rather to apply neural network ideas to
computational problems, and thereby to open some new avenues for realizing
powerful man-made computers.

We first review briefly the Hopfield neural network model, and describe some
minor modifications. Next, we propose a new scheme to represent numbers by
neuron state variables, which is essential in solving numerical problems on
neural networks. Based on this number representation scheme, we show how we
can program and solve combinatorial optimization problems known as network
flow problems, or more specifically as the "Hitchcock problem,"17 and simulate
its computational performance on a digital computer. Then, we give a
programming scheme to perform signal processing for signal recovery, such as
the computations of matrix inversion and Fourier transformation. The
performance is again simulated on a digital computer.

The  important  idea  of  programming  complexity  is  then  introduced,  and  it  is 
shown  that  for  some  problems  the  data-dependent  programming  complexity  is  so  great 
that  computations  invested  in  finding  the  right  neural  interconnection  and  bias  patterns 
may  equal  the  complexity  involved  in  solving  the  problem  directly  without  a  neural 
network.  For  such  problems,  neural  networks,  as  we  now  understand  them,  may  not 
be  an  appropriate  architecture  for  computational  problem-solving. 

We conclude with a discussion of the limitations and the problems that remain
to be solved in the future.

II. The Hopfield Model and Its Modifications
A. The Hopfield model

The Hopfield model8,9 consists of a number of mutually interconnected
nonlinear devices called "neurons" whose states are characterized by their
outputs V_i (which may take values between 0 and 1). The dynamics of neurons
in the Hopfield model can be described in both discrete and continuous spaces.

The discrete model is illustrated in Fig. 1. At fan-in terminals Z_i, each
neuron i receives inputs T_ij V_j from other neurons j and a bias input I_i
associated with itself:

$$U_i = \sum_{j=1}^{N} T_{ij} V_j + I_i, \tag{1}$$

where N is the number of neurons, and T_ij are elements of an interconnection
matrix representing the strengths of connections. At discrete times, switches
SW_i turn on, and the inputs U_i are fed back to corresponding neurons to
change their states or to leave their states fixed according to a threshold
rule determined by nonlinear operators NLR_i, such that

$$V_i(k+1) = \mathrm{stp}\big(U_i(k)\big), \tag{2}$$

where k is discrete time, and stp(x) is a unit step function which is 1 for
x>0 and 0 for x<0. Thus, neurons take binary values of either 1 or 0, and the
binary outputs are sent out from fan-out terminals Q_i and distributed through
the interconnection network to regenerate new inputs at the fan-in terminals.

In the continuous model, neurons change their states according to the
following equations of dynamics:

$$\frac{dU_i}{dt} = \sum_{j=1}^{N} T_{ij} V_j + I_i, \tag{3}$$

$$V_i = g(U_i), \tag{4}$$

where t is continuous time, and g(x) is a nonlinear function whose form can be
taken to be

$$g(x) = \tfrac{1}{2}\left[1 + \tanh(x/x_0)\right], \tag{5}$$

which approaches a unit step function as x0 tends to zero.
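As a quick illustration of Eq. (5), the following sketch evaluates g(x) for decreasing x0 and compares it with the unit step of the discrete model. The function names `g` and `stp` and the sample points are choices made here for illustration only.

```python
import math

def g(x, x0=1.0):
    """Sigmoid of Eq. (5): (1/2) * [1 + tanh(x / x0)]."""
    return 0.5 * (1.0 + math.tanh(x / x0))

def stp(x):
    """Unit step used in the discrete model: 1 for x > 0, else 0."""
    return 1.0 if x > 0 else 0.0

if __name__ == "__main__":
    for x0 in (1.0, 0.1, 0.01):
        print(x0, [round(g(x, x0), 4) for x in (-0.5, -0.05, 0.05, 0.5)])
```

For x0 = 1 the response near the origin is almost linear, whereas for x0 = 0.01 the values at x = ±0.05 are already within about 10^-4 of the step values 0 and 1.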

Hopfield has shown that if T_ij = T_ji, neurons in the continuous model always
change their states in such a manner that they minimize an energy function
defined by

$$E \equiv -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} T_{ij} V_i V_j - \sum_{i=1}^{N} I_i V_i, \tag{6}$$

and stop at minima of this function. The same is also true for neurons in the
discrete model if we further assume that T_ii = 0.
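The energy-descent property for the discrete model can be checked with a short simulation. This is a sketch under stated assumptions: the network is random (symmetric T with zero diagonal), neurons are updated one at a time with the threshold rule of Eq. (2), and the size, seed, and function names are arbitrary illustrative choices rather than anything used in the paper.

```python
import numpy as np

def energy(T, I, V):
    """Energy of Eq. (6): E = -(1/2) * sum_ij T_ij V_i V_j - sum_i I_i V_i."""
    return -0.5 * V @ T @ V - I @ V

def run(n=12, sweeps=20, seed=1):
    """Return the energy after each single-neuron update of a random network."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n, n))
    T = A + A.T                     # symmetric: T_ij = T_ji
    np.fill_diagonal(T, 0.0)        # no self-feedback: T_ii = 0
    I = rng.normal(size=n)
    V = rng.integers(0, 2, size=n).astype(float)
    energies = [energy(T, I, V)]
    for _ in range(sweeps):
        for i in range(n):
            # Threshold rule of Eq. (2) applied to neuron i only
            V[i] = 1.0 if T[i] @ V + I[i] > 0 else 0.0
            energies.append(energy(T, I, V))
    return np.array(energies)

if __name__ == "__main__":
    e = run()
    print("largest single-step energy change:", np.diff(e).max())
```

With T_ii = 0, changing V_i by ΔV_i changes the energy by -ΔV_i U_i, and the threshold rule only raises V_i when U_i > 0 and only lowers it when U_i ≤ 0, so no single update can increase E.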

B.  Neuron  transition  modes 

We adopt the discrete-time model because it is much easier to simulate on a
digital computer. But when T_ii ≠ 0, the model sometimes shows an oscillatory
behavior or keeps wandering around the state space near the minima of the
energy function. Most problems of practical interest require self feedbacks
(T_ii ≠ 0) when programmed on a neural network. We therefore need to design
transition modes that reduce such phenomena. Without claiming any similarity
to natural neuron transition rules, we choose four different discrete-time
transition modes for examination.

(a)  Direct  synchronous  transition  mode 

All the transitions occur simultaneously when the switches SW_i turn on in
synchronism at discrete times k. The fan-in inputs are directly fed back to
generate new neuron states. A continuous nonlinear function g(x) allows
neurons to take state values between 0 and 1. The following equations are
assumed to hold:

$$U_i(k) = \sum_{j=1}^{N} T_{ij} V_j(k) + I_i, \tag{7}$$

$$V_i(k+1) = g[U_i(k)]. \tag{8}$$

(b)  Differential  synchronous  transition  mode 

The differential equations in the continuous model are approximated by
difference equations. Transitions occur synchronously. In this case,

$$U_i(k) - U_i(k-1) = \sum_{j=1}^{N} T_{ij} V_j(k) + I_i, \tag{9}$$

$$V_i(k+1) = g[U_i(k)].$$

This mode requires one memory cell for each neuron to keep its previous input.

(c)  Direct  asynchronous  transition  mode  (random  delays) 

This mode is similar to mode (a), but the switches SW_i turn on and off
asynchronously, i.e., with random delays. In this case,

$$U_i(k - \Delta t_i) = \sum_{j=1}^{N} T_{ij} V_j(k - \Delta t_i) + I_i, \tag{10}$$

$$V_i(k - \Delta t_i + \epsilon) = g[U_i(k - \Delta t_i)],$$

where Δt_i are skews caused by time delays in the network, and are fractions
of one clock time, while ε is a small positive constant. Without loss of
generality we can assume

$$\Delta t_1 < \Delta t_2 < \cdots < \Delta t_N,$$

because the numbering of neurons is arbitrary. In this mode, one particular
neuron i need not wait for the last neuron N for synchronization, and when it
decides its new state, it can make use of information about new states of
other neurons that have already renewed their states.

(d) Differential asynchronous transition mode (random delays)

This is an asynchronous version of mode (b). In this case,

$$U_i(k - \Delta t_i) - U_i(k - \Delta t_i - 1) = \sum_{j=1}^{N} T_{ij} V_j(k - \Delta t_i) + I_i, \tag{11}$$

$$V_i(k - \Delta t_i + \epsilon) = g[U_i(k - \Delta t_i) - U_i(k - \Delta t_i - 1)].$$

Using simulations on a digital computer, we found that the synchronous
transition modes (a) and (b) gave rise to large oscillations in the energy
function when T_ii ≠ 0, but that the asynchronous transition modes (c) and (d)
have greatly reduced oscillatory or wandering behavior, though the reduction
is not complete. While mode (c) is quicker in minimizing the energy function,
mode (d) has more reduced oscillations. Depending on the characteristics of
the problems of interest, we shall make a proper choice of a mode from (c) and
(d).
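The contrast between synchronous and asynchronous updating can be seen in a toy example. The sketch below is not the simulation reported in the paper: it uses a hypothetical two-neuron network with mutual inhibition, negative self-feedback (T_ii ≠ 0), and a positive bias, and it simplifies mode (c) to a fixed update order in which each neuron sees the states already renewed in the current clock period.

```python
import numpy as np

def g(x, x0=0.1):
    """Sigmoid of Eq. (5) with a steep slope (x0 chosen for illustration)."""
    return 0.5 * (1.0 + np.tanh(x / x0))

def step_direct_sync(T, I, V):
    """Mode (a): every neuron updates at once from the same old state."""
    return g(T @ V + I)

def step_direct_async(T, I, V):
    """Mode (c), simplified: neurons update one after another in index order,
    each seeing the states already renewed during the current clock period."""
    V = V.copy()
    for i in range(len(V)):
        V[i] = g(T[i] @ V + I[i])
    return V

# Hypothetical two-neuron example: mutual inhibition, self-feedback, positive bias.
T = np.array([[-1.0, -2.0], [-2.0, -1.0]])
I = np.array([1.5, 1.5])

def trajectory(step, n_steps=6):
    """Rounded neuron states after each clock period, starting from all zeros."""
    V = np.zeros(2)
    states = []
    for _ in range(n_steps):
        V = step(T, I, V)
        states.append(np.round(V).astype(int).tolist())
    return states

if __name__ == "__main__":
    print("synchronous: ", trajectory(step_direct_sync))
    print("asynchronous:", trajectory(step_direct_async))
```

In this example the synchronous mode flips both neurons between 1 and 0 indefinitely, while the sequential update settles into the state (1, 0) after the first clock period.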

III.  Number  Representation  Schemes 

In most problems of practical interest, solutions are described by a set of
numbers. Therefore we must have a means to encode numbers on neuron state
variables V_i. While allowing neurons to take continuous state values during
the process of energy function minimization, we demand that they take binary
values of 1 or 0 at the final stage so that we can obtain digital solutions
like those given by digital computers. For simplicity, we first assume the
numbers are positive integers including 0, though we can also represent
general bipolar and complex numbers by using additional neurons. We consider
three different ways of mapping the positive integer space Z⁺ onto the neuron
state space V.

A.  Binary  scheme 

A common way of representing numbers in digital computers is to use binary
digits. For example, 5 is expressed by 0101. This scheme uses log2(N+1) bits
to express a number N. If we let one neuron represent one bit, we have a
one-to-one correspondence between elements in the number space Z⁺ and those in
the neuron state space V. Despite the economy in the number of bits or neurons
used, a system based on the binary scheme is not fault-tolerant. In other
words, even a single failure in a highly significant bit gives rise to a large
error in the number represented.

B. Simple-sum scheme

In this scheme, a number is represented by a simple sum of the neuron state
variables V_i, i.e., the total number of firing (V_i = 1) neurons. For
example, 5 is expressed by 0011111, 0101111, 1101011, etc., all of which have
five 1-bits. This is a one-to-many mapping from Z⁺ to V, and the numbers have
degenerate representations. This scheme requires N bits to express a number N,
and is not economical in the number of bits or neurons. However, it is highly
fault-tolerant because an error in a single bit does not cause a large error
in the number represented. The fault-tolerance of the human brain is believed
to come from this type of averaging over a large number of neurons.
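The difference in fault tolerance between the two schemes can be made concrete with a small sketch. The helper names and the least-significant-bit-first ordering used for the binary decoder are conventions chosen here, not taken from the paper.

```python
def decode_binary(bits):
    """Binary scheme: bit k (k = 0 is least significant) carries weight 2**k."""
    return sum(b << k for k, b in enumerate(bits))

def decode_simple_sum(bits):
    """Simple-sum scheme: the number is the count of firing neurons."""
    return sum(bits)

def worst_single_flip_error(decode, bits):
    """Largest change in the decoded number caused by flipping any one bit."""
    base = decode(bits)
    errors = []
    for k in range(len(bits)):
        flipped = list(bits)
        flipped[k] ^= 1
        errors.append(abs(decode(flipped) - base))
    return max(errors)

if __name__ == "__main__":
    five_binary = [1, 0, 1, 0]             # 0101, least significant bit first
    five_simple = [0, 0, 1, 1, 1, 1, 1]    # 0011111
    print(worst_single_flip_error(decode_binary, five_binary))
    print(worst_single_flip_error(decode_simple_sum, five_simple))
```

For the four-bit binary word a single failure in the most significant bit changes the value by 8, whereas any single failure in the simple-sum word changes it by exactly 1.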

So far, we have compared the binary scheme and the simple-sum scheme from the
viewpoint of their fault-tolerance. More important is their difference in
problem-solving capability. As will be seen later, problems are solved through
a spontaneous energy minimization process in a neural network, and the
solution is given by a point in the neuron state-variable space that is
reached after this minimization process. In the binary scheme, there is only
one point in the state variable space that gives a correct solution. In the
simple-sum scheme, on the other hand, multiple points give the correct
solution. Because of this degeneracy and the clustering of quasi-minimum
energy points in the neuron state-variable space, the simple-sum scheme offers
more chances to reach the correct solution. Suppose, for example, 3 is the
correct solution. In the simple-sum scheme, we can get a correct solution when
the final state is either 00111, 10110, 11100, or 10101, etc., whereas we can
get the correct solution in the binary scheme only when the final state is
00011. Simulation results reported later in this paper support the
hypothesized superiority of the simple-sum scheme.

C.  Group-and-weight  scheme 

Despite its merit in fault-tolerance and computational capability, the
simple-sum scheme requires too many neurons when solutions include large
numbers. We propose the group-and-weight scheme, which lies between the binary
and the simple-sum schemes. In this scheme, we divide the total q bits into K
groups each of which has M bits (q = KM), and interpret the groups as digits
whose values are given by simple sums of the bits in the corresponding groups.
For example, with q = 6, K = 2, M = 3, 5 is expressed by 100 100
(4^1 × (1+0+0) + 4^0 × (1+0+0) = 5), 010 001, 001 010, or 100 001, etc. A
number expression for the group-and-weight scheme is given by

$$\sum_{k=1}^{K}(M+1)^{k-1}\sum_{i=1}^{M}V_{(k-1)M+i}. \tag{12}$$

This expression includes the binary and the simple-sum schemes as special
cases. When we put M = 1 and K = q, we obtain a number expression for the
binary scheme

$$\sum_{k=1}^{q} 2^{k-1} V_k, \tag{13}$$

and when we put M = q and K = 1, we obtain a number expression for the
simple-sum scheme

$$\sum_{i=1}^{q} V_i. \tag{14}$$

The group-and-weight scheme requires M log_{M+1}(N+1) bits to express a number
N. This also gives the number of bits required in the binary scheme when we
put M = 1, and that required in the simple-sum scheme when we put M = N.
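The group-and-weight expression of Eq. (12) and its two special cases can be exercised with a short decoder. This is an illustrative sketch: the bit layout (group k = 1, the least significant group, stored first) and the brute-force enumeration of representations are assumptions made here for clarity.

```python
from itertools import product

def decode_group_and_weight(bits, K, M):
    """Eq. (12): sum over groups k of (M+1)**(k-1) times the sum of the group's bits.

    Assumed layout: bits[(k-1)*M : k*M] holds group k, with k = 1 the least
    significant group.
    """
    assert len(bits) == K * M
    return sum((M + 1) ** (k - 1) * sum(bits[(k - 1) * M : k * M])
               for k in range(1, K + 1))

def representations(value, K, M):
    """All bit patterns of length K*M that decode to `value` (brute force)."""
    return [bits for bits in product((0, 1), repeat=K * M)
            if decode_group_and_weight(bits, K, M) == value]

if __name__ == "__main__":
    print(decode_group_and_weight((1, 0, 0, 1, 0, 0), K=2, M=3))
    print(len(representations(5, K=2, M=3)))   # group-and-weight
    print(len(representations(5, K=4, M=1)))   # binary special case
    print(len(representations(5, K=1, M=7)))   # simple-sum special case
```

For the number 5 the binary case (M = 1) has a single representation, the group-and-weight case with K = 2 and M = 3 has nine, and the seven-neuron simple-sum case has twenty-one, which shows how the degeneracy grows as the scheme moves toward the simple sum.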


D.  Bipolar  and  Complex  Integers 


So far, we have restricted our number representations to positive integers,
but they can easily be extended to include bipolar and complex integers. A
bipolar expression can be obtained simply by adding a negative bias integer to
the expression for positive integers given by Eq. (12):

$$\sum_{k=1}^{K}(M+1)^{k-1}\sum_{i=1}^{M}V_{(k-1)M+i} - \left\lfloor\tfrac{1}{2}\left[(M+1)^{K}-1\right]\right\rfloor, \tag{15}$$

where (1/2)[(M+1)^K − 1] is half the largest positive integer that can be
expressed by Eq. (12), and the floor operation ⌊x⌋ gives the largest integer
not exceeding x. Eq. (15) can express bipolar integers ranging over
±⌊(1/2)[(M+1)^K − 1]⌋.
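A minimal sketch of the bipolar expression follows, assuming the same bit layout as in Eq. (12) with group k = 1 stored first; the function names are illustrative and not from the paper.

```python
def offset(K, M):
    """Bias integer of Eq. (15): floor(((M+1)**K - 1) / 2)."""
    return ((M + 1) ** K - 1) // 2

def decode_bipolar(bits, K, M):
    """Group-and-weight value of Eq. (12) minus the bias integer."""
    assert len(bits) == K * M
    positive = sum((M + 1) ** (k - 1) * sum(bits[(k - 1) * M : k * M])
                   for k in range(1, K + 1))
    return positive - offset(K, M)

if __name__ == "__main__":
    K, M = 2, 3
    print(offset(K, M))
    print(decode_bipolar((0,) * 6, K, M), decode_bipolar((1,) * 6, K, M))
```

With K = 2 and M = 3 the bias is 7, the all-zero state decodes to -7, and the all-one state decodes to 8, so when (M+1)^K − 1 is odd the sketch reaches one value beyond the symmetric range on the positive side.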


To express complex integers, we need twice as many neurons, i.e., neurons
V^(R)_i and V^(I)_i that represent real and imaginary parts, respectively.
Complex integers are expressed by

$$\left\{\sum_{k=1}^{K}(M+1)^{k-1}\sum_{i=1}^{M}V^{(R)}_{(k-1)M+i} - \left\lfloor\tfrac{1}{2}\left[(M+1)^{K}-1\right]\right\rfloor\right\} + j\left\{\sum_{k=1}^{K}(M+1)^{k-1}\sum_{i=1}^{M}V^{(I)}_{(k-1)M+i} - \left\lfloor\tfrac{1}{2}\left[(M+1)^{K}-1\right]\right\rfloor\right\}, \tag{16}$$

where j² = −1.

E.  General  Real  and  Complex  Numbers 

We can also express numbers with fractional digits, e.g. 13.26, 3.14, etc., by
using more neurons and labeling them with negative subscripts (i<0), e.g. $V_{-1}$, $V_{-2}$,
etc., so that the parameter k in the first summation in Eq. (12) can run from a negative
integer $-K'$; the number representation becomes

$$\sum_{k=-K'}^{K} (M+1)^{k-1} \sum_{i=1}^{M} V_{(k-1)M+i}. \qquad (17)$$

Equation (17) can express numbers ranging from 0 to $(M+1)^K - (M+1)^{-(K'+1)}$, with a
minimum digit of quantization being $(M+1)^{-(K'+1)}$. Just as we did in subsection D, we
can easily modify Eq. (17) to a form similar to Eq. (16), so that it can express general
complex numbers. Again here, the group-and-weight scheme includes the binary and
simple-sum schemes as special cases. If we put M=1, Eqs. (15), (16), and
(17) give the expressions for the binary scheme. Likewise, the expressions for the
simple-sum scheme can be obtained by substituting M=q and K=1 into Eqs. (15) and
(16), and M=q and K=-K' into Eq. (17).

Finally, it should be noted that the number representation schemes we proposed
here are all based on linear mapping of the number space onto the neuron state space.
In other words, numbers are represented by linear combinations of neuron state
variables. This is an important point in designing number representation schemes for the
Hopfield neural network, since the energy function Eq. (6) has a quadratic form with
respect to neuron state variables. Other nonlinear mapping schemes, like floating-point
expressions, cannot form the energy function required by the Hopfield model, because
the floating-point expressions need to have neuron state variables in exponents. This
certainly limits the possibility of covering a wide range of numbers using a small
number of neurons, but for a neural computer it is not a fatal disadvantage because the
use of ample neurons with much redundancy is the key to improving its computational
capability and system stability.


IV.  The  Hitchcock  Problem 

Based on the number representation schemes described in the previous section,
we show how a combinatorial optimization problem known as the Hitchcock problem
can be programmed and solved on a neural network.

Suppose there are m sources (X=1, ..., X=m) for a commodity, with $S_X$ units of
supply at X, and n sinks (Y=1, ..., Y=n) for the commodity, with a demand $D_Y$ at
Y, as shown in Fig. 2. If $C_{XY}$ is the unit cost of shipment from X to Y, the Hitchcock
problem is to find a flow $f_{XY}$ that satisfies the supply and demand constraints and
simultaneously minimizes the flow cost. Thus the problem is to minimize

$$\sum_{X=1}^{m} \sum_{Y=1}^{n} C_{XY} f_{XY} \qquad (18)$$

under the constraints

$$\sum_{Y=1}^{n} f_{XY} = S_X \quad (X = 1, 2, \ldots, m), \qquad (19)$$

and

$$\sum_{X=1}^{m} f_{XY} = D_Y \quad (Y = 1, 2, \ldots, n). \qquad (20)$$
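
Equations (18) through (20) translate directly into a cost evaluation and a feasibility test. The sketch below is a minimal illustration; the function names and the small 2-source, 3-sink instance are hypothetical and are not the data of Table 1.

```python
def total_cost(cost, flow):
    """Objective of Eq. (18): sum over X and Y of C_XY * f_XY."""
    return sum(c * f for c_row, f_row in zip(cost, flow)
               for c, f in zip(c_row, f_row))


def is_feasible(flow, supply, demand):
    """Check the source constraints, Eq. (19), and the demand constraints, Eq. (20)."""
    rows_ok = all(sum(row) == s for row, s in zip(flow, supply))
    cols_ok = all(sum(col) == d for col, d in zip(zip(*flow), demand))
    non_negative = all(f >= 0 for row in flow for f in row)
    return rows_ok and cols_ok and non_negative


# Hypothetical instance, chosen only for illustration.
cost = [[4, 1, 3],
        [2, 5, 2]]
supply = [3, 3]
demand = [2, 2, 2]
flow = [[0, 2, 1],
        [2, 0, 1]]

if __name__ == "__main__":
    print(is_feasible(flow, supply, demand), total_cost(cost, flow))
```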

In Table 1, (a) is an example of a unit cost table, and (b) is an example of a solution
represented in the form of a flow matrix or a transportation matrix. The flow matrix
describes, for example, that from the source at X=2, two units of the commodity
should be sent to the demand at Y=1, and one unit to the demand at Y=2.

A. Flow Matrix Representation

Table 2 shows how the flow matrix can be represented by neurons. We assign q
neurons to each matrix element to represent its content $f_{XY}$, so that we use N=qmn
neurons in total for the complete representation of the flow matrix. For the convenience
of mathematical treatment, we specify each neuron by a set of three subscripts, $V_{XY,i}$,
where XY specifies the matrix element the neuron belongs to, and i specifies the
position of the neuron in that matrix element. Since the group-and-weight number
representation scheme includes the binary and simple-sum schemes as special cases,
we express the flow matrix elements $f_{XY}$ by the group-and-weight scheme:

$$f_{XY} = \sum_{k=1}^{K} (M+1)^{k-1} \sum_{i=1}^{M} V_{XY,(k-1)M+i}. \qquad (21)$$


B.  Energy  Function 

We use the spontaneous energy minimization process of a neural network to
solve optimization problems. Since the energy function defined by Eq. (6) has a
quadratic form with respect to neuron state variables $V_i$, we find a quadratic function of
$V_{XY,i}$ such that the minimization of the function corresponds to minimizing the flow
cost and minimizing violations of the constraints. An energy function that satisfies
such requirements is given by

$$\begin{aligned}
E = {} & -\frac{A}{2} \sum_{X=1}^{m} \sum_{Y=1}^{n} \sum_{k=1}^{K} \sum_{i=1}^{M} (M+1)^{k-1} \left[ 1 - 2V_{XY,(k-1)M+i} \right]^2 \\
& + \frac{B}{2} \sum_{X=1}^{m} \left[ S_X - \sum_{Y=1}^{n} \sum_{k=1}^{K} \sum_{i=1}^{M} (M+1)^{k-1} V_{XY,(k-1)M+i} \right]^2 \\
& + \frac{C}{2} \sum_{Y=1}^{n} \left[ D_Y - \sum_{X=1}^{m} \sum_{k=1}^{K} \sum_{i=1}^{M} (M+1)^{k-1} V_{XY,(k-1)M+i} \right]^2 \\
& + \frac{D}{2} \left[ \sum_{X=1}^{m} \sum_{Y=1}^{n} \sum_{k=1}^{K} \sum_{i=1}^{M} C_{XY} (M+1)^{k-1} V_{XY,(k-1)M+i} \right]^2, \qquad (22)
\end{aligned}$$

where A, B, C, and D are positive weight factors. The first term, weighted by A, is
introduced for the binarization of the neuron state variables $V_{XY,i}$, i.e., $V_{XY,i}$ = 1 or 0.
Because the function $F(V) = -(1-2V)^2$ $(0 \le V \le 1)$ takes minimum values at V=0 and
V=1, minimizing this term assures that the final solution is given by binary numbers.
The second term, weighted by B, is introduced to minimize violations of the source
constraints given by Eq. (19). Likewise, through minimization of the third term with a
weight C, we can satisfy the demand constraints given by Eq. (20). The last term,
weighted by D, is for minimization of the total flow cost. The total cost is squared in
Eq. (22), but we may also introduce it without squaring, because the cost is always
positive. Note that the way we define the energy function is not unique, so that we
can solve the same problem by using different programs on the neural network, just as
is often the case in solving problems on conventional digital computers.

Considering the various terms represented in Eq. (22), it can be seen that
solutions with low energy do not necessarily correspond to solutions with low cost.
However, if the weighting constants are properly chosen, then the binarization, source and
demand constraints will eventually all be perfectly satisfied, resulting in a one-to-one
relation between energy and cost. Thus eventually low energy solutions will
correspond to low cost solutions.
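
The relation between energy and cost can be illustrated by evaluating Eq. (22) term by term. The following is a sketch under stated assumptions: the nested-list layout `V[X][Y]`, the function names, and the 2 x 2 instance are hypothetical, and the weight values are simply borrowed from the magnitudes used later in the numerical experiments.

```python
def flow_matrix(V, K, M):
    """Eq. (21): decode f_XY from V[X][Y], a list of K*M neuron states."""
    return [[sum((M + 1) ** k * sum(cell[k * M:(k + 1) * M]) for k in range(K))
             for cell in row] for row in V]


def energy(V, cost, supply, demand, K, M, A, B, C, D):
    """Eq. (22), term by term, for neuron states between 0 and 1."""
    f = flow_matrix(V, K, M)
    binarization = -(A / 2) * sum((M + 1) ** k * (1 - 2 * v) ** 2
                                  for row in V for cell in row
                                  for k in range(K)
                                  for v in cell[k * M:(k + 1) * M])
    source = (B / 2) * sum((s - sum(row)) ** 2 for row, s in zip(f, supply))
    sink = (C / 2) * sum((d - sum(col)) ** 2 for col, d in zip(zip(*f), demand))
    total_cost = sum(c * x for c_row, f_row in zip(cost, f)
                     for c, x in zip(c_row, f_row))
    return binarization + source + sink + (D / 2) * total_cost ** 2


# Hypothetical 2 x 2 instance with the simple-sum scheme (K=1, M=2).
cost, supply, demand = [[1, 2], [3, 1]], [1, 1], [1, 1]
weights = dict(A=27.0, B=80.0, C=80.0, D=0.2)
feasible = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]    # f = [[1, 0], [0, 1]], cost 2
infeasible = [[[1, 0], [1, 0]], [[0, 0], [0, 1]]]  # f = [[1, 1], [0, 1]]
e_feasible = energy(feasible, cost, supply, demand, 1, 2, **weights)
e_infeasible = energy(infeasible, cost, supply, demand, 1, 2, **weights)
```

For a binary, feasible state the constraint terms vanish and the binarization term is the same constant for every such state, so the energy differs between feasible states only through the squared cost. In this instance the infeasible state has the higher energy.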

C.  Interconnection  Matrix 

By analogy with digital computers, if we regard the expression for the energy
function Eq. (22) as a source program, then the next step is to compile or map it onto
the interconnection strengths $T_{ij}$ of the neural network. This can be done by
comparing Eq. (22) with the energy function Eq. (6), which is now written as

$$\begin{aligned}
E = {} & -\frac{1}{2} \sum_{X=1}^{m} \sum_{Y=1}^{n} \sum_{k=1}^{K} \sum_{i=1}^{M} \sum_{X'=1}^{m} \sum_{Y'=1}^{n} \sum_{k'=1}^{K} \sum_{i'=1}^{M}
T_{XY,(k-1)M+i;\,X'Y',(k'-1)M+i'} \, V_{XY,(k-1)M+i} \, V_{X'Y',(k'-1)M+i'} \\
& - \sum_{X=1}^{m} \sum_{Y=1}^{n} \sum_{k=1}^{K} \sum_{i=1}^{M} I_{XY,(k-1)M+i} \, V_{XY,(k-1)M+i}, \qquad (23)
\end{aligned}$$

where $T_{XY,(k-1)M+i;\,X'Y',(k'-1)M+i'}$ denotes the strength of the interconnection between the
neuron at the [(k-1)M+i]th position in the flow matrix element at XY, and the neuron
at the [(k'-1)M+i']th position in the flow matrix element at X'Y'. By equating the
corresponding coefficients of the two quadratic equations (22) and (23), we can
determine the interconnection strengths and the biases:

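$$\begin{aligned}
T_{XY,(k-1)M+i;\,X'Y',(k'-1)M+i'} = {} & 4A(M+1)^{k-1} \delta_{XX'} \delta_{YY'} \delta_{kk'} \delta_{ii'} \\
& - B(M+1)^{k+k'-2} \delta_{XX'} - C(M+1)^{k+k'-2} \delta_{YY'} - D(M+1)^{k+k'-2} C_{XY} C_{X'Y'}, \qquad (24)
\end{aligned}$$

and

$$I_{XY,(k-1)M+i} = -2A(M+1)^{k-1} + B(M+1)^{k-1} S_X + C(M+1)^{k-1} D_Y, \qquad (25)$$

where $\delta_{ZZ'}$ is a Kronecker delta defined by

$$\delta_{ZZ'} = \begin{cases} 1 & (Z = Z') \\ 0 & (Z \neq Z'). \end{cases}$$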
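
The "compilation" step of Eqs. (24) and (25) can be checked against Eq. (22) numerically. The sketch below builds the interconnection matrix and biases for a flattened neuron list and compares the quadratic form of Eq. (23) with a direct evaluation of Eq. (22). The function names and the flat indexing are assumptions made here; the two energies are expected to agree up to the state-independent terms that Eq. (23) omits.

```python
def build_network(cost, supply, demand, K, M, A, B, C, D):
    """Return (neurons, T, I) following Eqs. (24) and (25).

    neurons[a] = (X, Y, k, i) with zero-based indices, so the weight
    (M+1)^(k-1) of the text becomes (M+1)**k here.
    """
    m, n = len(supply), len(demand)
    neurons = [(X, Y, k, i) for X in range(m) for Y in range(n)
               for k in range(K) for i in range(M)]
    size = len(neurons)
    T = [[0.0] * size for _ in range(size)]
    I = [0.0] * size
    for a, (X, Y, k, i) in enumerate(neurons):
        w = (M + 1) ** k
        I[a] = -2 * A * w + B * w * supply[X] + C * w * demand[Y]  # Eq. (25)
        for b, (X2, Y2, k2, i2) in enumerate(neurons):
            ww = w * (M + 1) ** k2
            t = -D * ww * cost[X][Y] * cost[X2][Y2]  # global term
            if X == X2:
                t -= B * ww                          # same row
            if Y == Y2:
                t -= C * ww                          # same column
            if a == b:
                t += 4 * A * w                       # self-feedback
            T[a][b] = t                              # Eq. (24)
    return neurons, T, I


def hopfield_energy(T, I, v):
    """Quadratic form of Eq. (23) for a flat state vector v."""
    size = len(v)
    quad = sum(T[a][b] * v[a] * v[b] for a in range(size) for b in range(size))
    return -0.5 * quad - sum(I[a] * v[a] for a in range(size))


def direct_energy(neurons, v, cost, supply, demand, M, A, B, C, D):
    """Eq. (22) evaluated directly from the same flat state vector."""
    m, n = len(supply), len(demand)
    f = [[0.0] * n for _ in range(m)]
    binar = 0.0
    for (X, Y, k, i), s in zip(neurons, v):
        f[X][Y] += (M + 1) ** k * s
        binar += (M + 1) ** k * (1 - 2 * s) ** 2
    src = sum((supply[X] - sum(f[X])) ** 2 for X in range(m))
    snk = sum((demand[Y] - sum(f[X][Y] for X in range(m))) ** 2 for Y in range(n))
    tot = sum(cost[X][Y] * f[X][Y] for X in range(m) for Y in range(n))
    return -(A / 2) * binar + (B / 2) * src + (C / 2) * snk + (D / 2) * tot ** 2


def constant_offset(neurons, supply, demand, M, A, B, C):
    """State-independent terms of Eq. (22) that do not appear in Eq. (23)."""
    return (-(A / 2) * sum((M + 1) ** k for (_, _, k, _) in neurons)
            + (B / 2) * sum(s * s for s in supply)
            + (C / 2) * sum(d * d for d in demand))
```

For any state vector `v`, `direct_energy(...)` should equal `hopfield_energy(T, I, v) + constant_offset(...)` to within rounding error, which is one way to confirm that the signs and exponents in Eqs. (24) and (25) are mutually consistent.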

In Eq. (24), the first term describes self-feedbacks, and the second and third terms
represent local interconnections between neurons in the same row (X'=X) and in the
same column (Y'=Y), respectively. The last term describes the global interconnections
between all neurons. If we put M=1 and K=q, we obtain the interconnection strengths
and the biases for the binary number representation scheme:

$$T_{XYk;\,X'Y'k'} = 4A\,2^{k-1} \delta_{XX'} \delta_{YY'} \delta_{kk'} - B\,2^{k+k'-2} \delta_{XX'} - C\,2^{k+k'-2} \delta_{YY'} - D\,2^{k+k'-2} C_{XY} C_{X'Y'}, \qquad (26)$$

and

$$I_{XYk} = -A\,2^{k} + B\,2^{k-1} S_X + C\,2^{k-1} D_Y. \qquad (27)$$

Likewise, the interconnection strengths and the biases for the simple-sum scheme can
be obtained by putting M=q and K=1:

$$T_{XY,i;\,X'Y',i'} = 4A \delta_{XX'} \delta_{YY'} \delta_{ii'} - B \delta_{XX'} - C \delta_{YY'} - D C_{XY} C_{X'Y'}, \qquad (28)$$

and

$$I_{XY,i} = -2A + B S_X + C D_Y. \qquad (29)$$

D.  Numerical  Experiments 

To examine the computational performance of a neural network, we simulated
state transitions of neurons by using a digital computer. We used the unit costs and
the source and demand constraints listed in Table 1. Based on these data, we
determined the interconnection strengths and the biases. Since at present we have no
systematic methods for finding the best combination of the weighting factors A, B, C, and
D, they were found empirically through the observation of several experimental results.
The lack of a systematic method for finding the weighting factors should not be too
disturbing. Such a situation is commonly encountered in solving multiple-target
optimization problems (on a conventional digital computer), such as lens design problems
and color matching problems. However, it should be emphasized that the ability to
obtain a good solution depends strongly on making good choices for A, B, C, and D.
Throughout the experiments with the Hitchcock problem, we used the direct
asynchronous transition mode and the nonlinear function given by Eq. (5) with $0.1 < x_0 < 1$.

Figure 3 shows an example of the reduction of energy performed by a network
with N=60 neurons that represent the flow matrix based on the binary number
representation scheme (N = qmn = 3x4x5 = 60, M=1, K=3). Table 3 shows the flow
matrices obtained at several points on the curve of Fig. 3. The weight factors were
chosen as A=27, B=C=80, and D=0.2. Since we have no a priori knowledge about the
solution, uniformly distributed random numbers between 0 and 1 were generated and
assigned to the initial states of the neurons. Starting from a very high energy state, the
neural network reduced its energy spontaneously by changing its state so that the flow
matrix could satisfy the constraints while minimizing the total cost. After six
iterations, we reached feasible solutions (marked by open circles) that satisfied all the
constraints and gave 40 as the total cost.

After arriving at a solution using the neural network, it is important to develop
some understanding of how good that solution might be. To achieve this end, one
could enumerate all the feasible solutions that satisfy the constraints, and from this set
determine the best solution. However, since it is very hard to enumerate all the
solutions of under-determined simultaneous integer equations, Eqs. (19) and (20) (which
belong to a family of Diophantine equations), we used a Monte Carlo method and
found 50,000 feasible solutions. (Note that this calculation was performed simply to
check how well the neural network had performed.) Figure 4 shows a cost histogram
of the feasible solutions found. The solution with cost 40 is found to be one of the
very good solutions, which would be reached only with a probability of $6 \times 10^{-5}$ if we
searched randomly among the feasible solutions. Yet it is still not the best solution,
whose cost was confirmed to be 38 by using a stepping stone algorithm. Figure 5 and
Table 4 show another example, for which we assigned 0.5 to the initial states of all
neurons, so that they started evolving from the fuzziest states. In this example, we
reached a feasible solution with cost 49 at the seventh iteration, but we could not reach
any other feasible solutions by further iterations. The oscillatory behavior of the
energy function arises from using a discrete model with self-feedback. The solution
with cost 49 is fairly good but not as good as in the previous example. Experiments
performed with different initial values and/or weight factors gave solutions most
frequently with costs around 50, and could not pick up the best solution. In the worst cases,
no feasible solution could be reached. These results are indicative of the limitations of
the problem-solving capability of the binary number representation scheme. As we
now show, much better results can be obtained with a degenerate number
representation scheme.

To examine the problem-solving capability of the degenerate number
representation schemes, we programmed the same problem on a 140-neuron network using the
simple-sum scheme (N = qmn = 7x4x5 = 140, M=7, K=1). Figures 6 and 7 and
Tables 5 and 6 show the computational performance of the 140-neuron network with
its initial states all set equal to 0.5, the fuzziest states. Weight factors were chosen to
be A=29, B=80, C=80, and D=0.55. Through the first several iterations, the source and
demand constraints came to be almost satisfied (see Fig. 6 and Table 5), and at the
sixth iteration the first feasible solution, with cost 43, was reached (see Fig. 7 and
Table 6). The solution was improved further by continuing iterations: after passing another
feasible solution with cost 40 at the tenth iteration, the best solution with cost 38 was
finally reached on the twenty-first iteration. To show the role played by the
degeneracy of the number representation, the complete states of the 140 neurons are depicted
in Fig. 8 for the iterations from No. 21 through No. 28. Each neuron is represented by
a star when it is firing ($V_{XY,i} = 1$) and by a dot when not firing ($V_{XY,i} = 0$). The
number of neurons that are firing in each set of seven neurons represents the content of
the flow matrix element $f_{XY}$ at the corresponding position. At iteration No. 21, for
example, we had $f_{25} = 1$ because only one neuron $V_{25,3}$ was firing ($V_{25,3} = 1$) and the
other six neurons were not firing. At iteration No. 22, neuron $V_{25,3}$ stopped firing, but
the correct solution $f_{25} = 1$ was retained because the neighboring neuron $V_{25,2}$ started
firing instead of $V_{25,3}$. We can observe a similar phenomenon in other sets of neurons
representing $f_{35}$ and $f_{45}$ at iterations No. 21, 22, 23, 25, 26, and 27. In this manner,
the neural network can give correct solutions at many different points in its state
space, and these points cluster in a particular region of the state space that corresponds
to low energy function values. It is because of this characteristic that the degenerate
number representation scheme can have better problem-solving capabilities than the
pure binary number representation scheme.

Figure  9  and  Table  7  show  another  example  of  the  computational  performance  of 
the  140-neuron  network,  where  uniform  random  numbers  between  0  and  1  were 
assigned  to  the  initial  state  variables  of  the  neurons.  In  this  example,  we  obtained  two 
different  solutions  with  cost  38,  showing  that  the  best  solution  is  not  unique. 
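
The degeneracy that these experiments exploit can be counted directly. In the simple-sum scheme, a value v held by q neurons is represented by every state with exactly v neurons firing, whereas a pure binary code has a single state per value. The short sketch below, with illustrative function names, tabulates both counts.

```python
from math import comb


def simple_sum_degeneracy(q, value):
    """Number of binary states of q neurons whose simple sum, Eq. (14), equals value."""
    return comb(q, value)


def binary_degeneracy(q, value):
    """Each value below 2**q has exactly one pure binary code, Eq. (13)."""
    return 1 if 0 <= value < 2 ** q else 0


if __name__ == "__main__":
    # Seven neurons per element in the simple-sum scheme, three in the binary scheme.
    for value in range(8):
        print(value, simple_sum_degeneracy(7, value), binary_degeneracy(3, value))
```

With seven neurons per matrix element, the value 1 is held by any of seven states, which is why handing the firing from one neuron to its neighbor leaves the represented flow unchanged.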

V.  Simultaneous  Equations 

In  this  section  we  show  how  we  can  program  and  solve  on  a  neural  network 
simultaneous  equations 

Hx  =  y  (30) 

where H is a full-rank square matrix with NxN elements, and x and y are vectors with
N elements representing, respectively, unknown and given variables. (Note that
deconvolution is a special case of this general problem.)

A.  Energy  Function 

In order to use the spontaneous energy-minimization process of the neural
network, we reformulate the problem in the form of a minimization problem by
introducing an energy function that includes a term

$$\| y - Hx \|^2, \qquad (31)$$

so that the norm of the difference can be minimized through the energy minimization
process. For our later demonstration of the Fourier transformation, we allow y and H
to take on complex values, but, for the sake of simplicity, we restrict x to only positive
integer values, although we could include complex numbers by using additional
neurons labeled by a more complicated set of subscripts. As in Eq. (21), we express the
nth element $x_n$ of the unknown vector x by the group-and-weight scheme:

$$x_n = \sum_{k=1}^{K} (M+1)^{k-1} \sum_{i=1}^{M} V_{n,(k-1)M+i}. \qquad (32)$$

By substituting Eq. (32) into Eq. (31), we have an energy function

$$\begin{aligned}
E = {} & -\frac{A}{2} \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{i=1}^{M} (M+1)^{k-1} \left[ 1 - 2V_{n,(k-1)M+i} \right]^2
+ \frac{B}{2} \sum_{l=1}^{N} \left| y_l - \sum_{n=1}^{N} h_{ln} \sum_{k=1}^{K} \sum_{i=1}^{M} (M+1)^{k-1} V_{n,(k-1)M+i} \right|^2 \\
= {} & -\frac{A}{2} \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{i=1}^{M} (M+1)^{k-1} \left[ 1 - 2V_{n,(k-1)M+i} \right]^2 \\
& + \frac{B}{2} \sum_{l=1}^{N} \sum_{n=1}^{N} \sum_{n'=1}^{N} \sum_{k=1}^{K} \sum_{k'=1}^{K} \sum_{i=1}^{M} \sum_{i'=1}^{M}
h_{ln}^{*} h_{ln'} (M+1)^{k+k'-2} V_{n,(k-1)M+i} V_{n',(k'-1)M+i'} \\
& - B \sum_{l=1}^{N} \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{i=1}^{M} (M+1)^{k-1} \mathrm{Re}\left[ h_{ln}^{*} y_l \right] V_{n,(k-1)M+i}
+ \frac{B}{2} \sum_{l=1}^{N} | y_l |^2, \qquad (33)
\end{aligned}$$

where, as in Eq. (22), the first term is for binarization, $y_l$ and $h_{ln}$ are elements of y and
H, and * and Re[ ] denote complex conjugate and real part, respectively.


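B. Interconnection Matrix

The energy function Eq. (6) is now rewritten as

$$\begin{aligned}
E = {} & -\frac{1}{2} \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{i=1}^{M} \sum_{n'=1}^{N} \sum_{k'=1}^{K} \sum_{i'=1}^{M}
T_{n,(k-1)M+i;\,n',(k'-1)M+i'} \, V_{n,(k-1)M+i} \, V_{n',(k'-1)M+i'} \\
& - \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{i=1}^{M} I_{n,(k-1)M+i} \, V_{n,(k-1)M+i}. \qquad (34)
\end{aligned}$$

By equating the corresponding coefficients of Eqs. (33) and (34), we determine the
interconnection strengths and the biases:

$$T_{n,(k-1)M+i;\,n',(k'-1)M+i'} = 4A(M+1)^{k-1} \delta_{nn'} \delta_{kk'} \delta_{ii'} - B(M+1)^{k+k'-2} \sum_{l=1}^{N} h_{ln}^{*} h_{ln'}, \qquad (35)$$

$$I_{n,(k-1)M+i} = -2A(M+1)^{k-1} + B(M+1)^{k-1} \mathrm{Re}\left[ \sum_{l=1}^{N} h_{ln}^{*} y_l \right]. \qquad (36)$$

Equation (31) includes the discrete Fourier transform as a special case with

$$h_{ln} = \exp\left[ -2\pi j (l-1)(n-1)/N \right], \qquad (37)$$

and the inverse transform is computed by solving the simultaneous linear equations.
In this case, Eq. (35) takes a simple form due to the orthogonality of the Fourier
transform matrix:

$$T_{n,(k-1)M+i;\,n',(k'-1)M+i'} = 4A(M+1)^{k-1} \delta_{nn'} \delta_{kk'} \delta_{ii'} - BN(M+1)^{k+k'-2} \delta_{nn'}. \qquad (38)$$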
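
The simplification of Eq. (35) for the Fourier case rests on the orthogonality of the columns of the transform matrix. A short worked step, using only the definition of $h_{ln}$ in Eq. (37), is:

```latex
\sum_{l=1}^{N} h_{ln}^{*} h_{ln'}
  = \sum_{l=1}^{N} \exp\!\left[ \frac{2\pi j\,(l-1)(n-n')}{N} \right]
  = \begin{cases} N & (n = n') \\ 0 & (n \neq n') \end{cases}
  = N\,\delta_{nn'} .
```

For $n \neq n'$ the sum is a geometric series over a full set of Nth roots of unity and therefore vanishes, which replaces the data-dependent sum in the interconnection strengths by the constant factor $N\delta_{nn'}$.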

C.  Numerical  Experiments 

Computations of the inverse Fourier transform were programmed on the neural
network and the performance was simulated on a digital computer. We used signals
with N=15 sample points. Each sample point $x_n$ was expressed by 24 neurons based
on the simple-sum scheme (M=24, K=1), so that 360 neurons were employed in total.
We adopted the differential asynchronous transition mode, and chose weight factors as
A=28 and B=1. In Fig. 10, (a) and (b) show, respectively, an original signal x and its
Fourier transform y (only absolute values are shown in the figure). The task given to
the neural network is to compute x from a given y.

Assuming no a priori knowledge, we started from the fuzziest initial states
$V_{n,i} = 0.5$ shown in Fig. 10 (c) and got the result shown in Fig. 10 (d) after only two
iterations. Another example is shown in Fig. 11, where we used an asymmetric signal
and started from random initial states. Again after only two iterations we obtained the
result shown in Fig. 11 (d). Although the solutions obtained are not exact, the speed
of computation is impressive. In fact, this apparently enormous speed of computation
is quite misleading, for reasons that will be revealed in the following section.

VI.  Computational  and  Programming  Complexities 

As has been demonstrated in Sections IV and V, the computational speed of a
neural network is very high, solutions (though not always exact) being obtained within
several clock times (iterations). At present, we do not know how the computation time
(the number of iterations required) is related to the problem size (the number of
neurons employed) and to the algorithm (the choice of the interconnections). We
conjecture that the computation time does not grow too rapidly with problem size, because
the greater the problem size, the more neurons participate in solving the problem, and
the higher the parallelism used. If this conjecture is correct, the computation time is
very short for a properly programmed (interconnected) neural network, irrespective of
the problem size. It may appear, then, that neural networks would be the computation
architecture of choice in most problems that can be included within the energy
minimization framework. However, this conclusion is not correct. Although the computation
time itself may be very short, it may be necessary to invest very significant
computation time simply to program the network, i.e. to determine the proper interconnection
strengths and neural biases. The situation is somewhat analogous to the classical
analog electronic computer, for which a large amount of time must be spent wiring the
proper modules together before any problem can be solved. Once the modules are
connected, a solution appears almost immediately.

A.  Programming  Complexity 

By analogy with the concept of computational complexity in digital
computing, we introduce the concept of programming complexity in neural computing. We
define programming complexity as the number of arithmetic operations that must be
performed to determine the proper interconnection strengths and neural biases for the
problem to be solved. Conventional digital computers also need programming, but
once the program is compiled and stored in memory, it can be used on many different
sets of input data. For this reason, the concept of programming complexity has little
significance in the world of conventional digital computers, where programs are
completely separable from data. In neural network computers, a program and data are
generally mixed together and stored in the interconnection strengths and/or neural
biases. For example, in Eq. (24), the first three terms represent part of the program
(since they do not depend on data), and the last term, including the costs $C_{XY}$,
corresponds to the data. Therefore, we must redetermine the interconnection strengths
and/or the biases each time we use new data. In such an environment, the
programming complexity becomes an important measure of the efficiency of neural computing.
We know that it is not meaningful to compare the efficiencies of conventional digital
computers and neural computers on the basis of computational complexity and
programming complexity, because they mean different things. Digital computers always
give exact solutions (within the machine precision) after performing the number of
operations specified by the computational complexity, whereas neural computers do not
guarantee exact solutions even if they are programmed by performing the number of
operations specified by the programming complexity. Nevertheless, a comparison of
the computational complexity and the programming complexity does reveal certain
interesting aspects of neural computing, as discussed in the following section.

B. Simultaneous Equations

To solve simultaneous equations with N unknown variables, we employed qN
neurons, with q being the number of neurons used to represent each unknown variable.
We consider q to be a constant factor, since it does not depend on N. The number of
interconnections is given by $(1/2)qN(qN+1) = O(N^2)$, and the number of biases is
$qN = O(N)$. We need $O(N)$ operations to determine each interconnection strength (see
Eq. (35)) and each bias (see Eq. (36)), so that the programming complexity is $O(N^3)$.
The computational complexity of this problem is also $O(N^3)$ (Ref. 14). This means that
solutions of such a problem on either a neural computer or a conventional digital computer
would require essentially the same computational load. In the case of the neural
computer, the computations must be expended to determine the interconnection strengths
and biases, while in the case of the conventional digital computer the computations are
expended on solving the problem itself.

This comparison is even more striking in the case of the Fourier transformation
discussed earlier. Since Eq. (38) contains no data terms, we need not recompute the
interconnection strengths for each different set of data. The programming complexity
comes only from computation of the term $\sum_{l=1}^{N} h_{ln}^{*} y_l$ in the biases, Eq. (36). Noting Eq.
(37), we find that to determine the proper biases, we must in fact compute the very
same inverse Fourier transform that the neural network was to find! Thus we have
already arrived at the solution by the time we finish programming, and it is now no
surprise that the neural network supplies the answer in only two iterations. The answer is
in fact pre-programmed into the machine!
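
This observation is easy to confirm numerically. The sketch below, written with illustrative function names and a hypothetical five-point signal, computes the data-dependent part of the bias in Eq. (36) for the Fourier matrix of Eq. (37) and shows that, divided by N, it reproduces the original signal.

```python
import cmath


def dft(x):
    """y_l = sum_n h_ln x_n with h_ln from Eq. (37), using zero-based l and n."""
    N = len(x)
    return [sum(cmath.exp(-2j * cmath.pi * l * n / N) * x[n] for n in range(N))
            for l in range(N)]


def bias_data_term(y):
    """Re[sum_l conj(h_ln) y_l] for each n: the data-dependent part of Eq. (36)."""
    N = len(y)
    return [sum(cmath.exp(2j * cmath.pi * l * n / N) * y[l] for l in range(N)).real
            for n in range(N)]


# Hypothetical signal, chosen only for illustration.
x = [3, 0, 7, 1, 4]
y = dft(x)
recovered = [term / len(x) for term in bias_data_term(y)]

if __name__ == "__main__":
    print([round(value, 6) for value in recovered])
```

Computing the bias term for every n costs on the order of N^2 operations when done directly, the same as a direct inverse transform, so programming the biases already amounts to solving the problem.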


C.  The  Traveling-Salesman  Problem 

In the previous section we saw an example in which the programming complexity
of a neural computer and the computational complexity on a conventional computer
are of the same order. The question naturally arises as to whether this is the case with
all problems. If so, neural computing loses most of its attractiveness. Hopfield and
Tank's paper on the traveling salesman problem provides the best example with
which to answer this question. The computational complexity of the traveling
salesman problem, solved by exhaustive search, grows factorially, as $O(N!)$, with the
number of cities N. Hopfield and Tank showed that the problem can be programmed on a
neural network with $N^2$ neurons that represent the elements of a permutation matrix.
We can show that the programming complexity of this scheme is $O(N^3)$. This large
difference of complexities makes neural computing very attractive, even though it does
not guarantee the best solution.

D. The Hitchcock Problem

Computational  complexity  in  conventional  digital  computing  depends  greatly  on 
the  algorithms  used,  so  that  a  great  effort  has  been  made  by  computer  scientists  to 
seek  better  algorithms  and  thereby  reduce  computational  complexity.  The  same  can  be 
true  with  programming  complexity  in  neural  computing.  The  Hitchcock  problem  pro¬ 
vides  a  good  example  for  demonstrating  good  and  poor  algorithms  (ways  of  intercon¬ 
nection)  in  terms  of  programming  complexity.  In  Section  IV,  the  Hitchcock  problem 
with  m  sources  and  n  demands  was  solved  by  using  qmn'--O(mn)  neurons.  Since  Eqs. 
(24)  and  (25)  include  data  CXy,  Sx,  and  Dy,  we  have  to  redetermine 
( \l2)qmn{\+qmn)~0{nrn 2)  interconnection  strengths  and  qmn~0{mn)  biases  for  each 
new  set  of  data.  Eaclt  interconnection  strength  and  bias  can  be  determined  by  a  con¬ 
stant  number  of  operations,  so  that  the  programming  complexity  is  given  by 
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O(nrrr)  =  0(nA)  for  m~n.  In  Section  IV  B,  we  suggested  an  alternative  definition  of 
the  energy  function  that  does  not  square  the  total  cost  in  the  last  term  of  Eq.  (22).  If 
we  use  this  new  energy  function,  the  interconnection  strengths  and  biases  become 

Txuk- 1  )M+iy  r,(r'-i  =  4/\(A/+l)i  15<v;\"S>'y-5u-5„-  (39) 

-  £(.U+l)ur-25AX  -  C(A/+l)tu''"25yr, 

/xy,(Jt-i)A  Ui  =  — 2/\(A/+l)i_1  +  B{M+\)k~lSx  +  C(M+\)k~]DY  (40) 

-  ( 1  /2 )D  (A/+ 1 ) *~ 1  CXy 

Now  the  interconnection  strengths  do  not  depend  on  the  data  and  they  need  not 
be  re-determined  for  each  new  set  of  data,  so  that  the  programming  complexity  comes 
only  from  the  biases,  Eq.  (40),  and  is  given  by  0{nm)~0{rr)  for  m~n.  This  is  a  very 
significant  improvement.  The  computational  complexity  of  the  Hitchcock  problem 
depends  on  the  algorithm  used  by  a  conventional  digital  computer.  If  we  search  for 
the  best  solution  randomly  among  all  the  possible  combinations  of  the  neural  states,  it 
becomes  2qmn^O{2,nn).  Even  if  we  restrict  the  search  to  feasible  solutions,  it  can  still 
be  exponential  0{nm~x mn~x)  Of  course,  these  algorithms  are  worst  extremes,  and 
there  exist  several  good  algorithms  that  are  in  practical  use.  We  do  not  know  exactly 
what  is  the  computational  complexity  of  the  best  existing  algorithm  for  the  Hitchcock 
problem,  but  we  estimate  it  to  be  a  low-order  polynomial.  If  it  is  still  higher  than 
O(mn),  then  neural  computing  can  have  an  advantage  for  this  problem. 
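
The parameter counts behind this comparison can be tallied directly. The following Python sketch is illustrative only: the function names `programming_counts` and `reprogramming_cost`, and the choice of q = 3 neurons per flow element, are assumptions introduced here and do not come from the original work. It counts the interconnection strengths and biases quoted above for a network of qmn neurons, and shows how many parameters must be recomputed per data set when the strengths are data-dependent and when they are not.

```python
def programming_counts(m, n, q):
    """Return (strengths, biases) for m sources, n demands, q neurons per flow element."""
    neurons = q * m * n
    # Symmetric interconnection matrix, diagonal included: (1/2) * N * (N + 1) entries.
    strengths = neurons * (neurons + 1) // 2
    biases = neurons
    return strengths, biases


def reprogramming_cost(m, n, q, data_dependent_strengths):
    """Number of parameters to recompute for each new set of data."""
    strengths, biases = programming_counts(m, n, q)
    # With data-dependent strengths, everything is recomputed; otherwise only the biases.
    return strengths + biases if data_dependent_strengths else biases


if __name__ == "__main__":
    # Hypothetical size: 4 sources and 5 demands, with an assumed q = 3.
    print(reprogramming_cost(4, 5, 3, True))   # original energy function
    print(reprogramming_cost(4, 5, 3, False))  # energy function without the squared cost term
```

The first count grows as the square of the number of neurons and the second only linearly, which is the improvement from $O(n^4)$ to $O(n^2)$ discussed above for $m \sim n$ and fixed q.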

VII.  Conclusion 

Following the lead of Hopfield and Tank, we proposed an architecture for programming
highly parallel computation on neural networks. In Section III, we described
number  representation  schemes  based  on  linear  mapping  of  the  number  space  onto  the 
neuron space, and pointed out the advantage of the degenerate number representation
schemes. In Sections IV and V, the validity of the architecture was demonstrated by
solving  the  Hitchcock  problem  and  simultaneous  linear  equations  on  neural  networks. 
The  dynamics  of  the  neural  network  were  simulated  on  a  digital  computer.  In  Section 
VI,  we  introduced  the  new  concept  of  programming  complexity  in  neural  computing, 
which  was  used  to  evaluate  the  computational  efficiency  of  algorithms  performed  on 
neural  networks.  We  compared  the  programming  complexity  with  the  "worst  case" 
computational  complexity,  simply  because  the  "average"  complexity  was  too  hard  to 
estimate. However, we note that programming complexity is better compared with
"average" computational complexity, because the two share a common characteristic:
the solution is not always the best or exact one, even if we perform the number of
operations specified by these complexities.
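
As a reminder of what a linear number representation looks like, the sketch below decodes a neuron state into a number as a weighted sum of binary neuron outputs. It is a minimal illustration under stated assumptions: the helper names are hypothetical, and the weights (powers of two for the binary scheme, unit weights for the simple-sum scheme) are inferred from the scheme names used in this paper rather than copied from its equations. The enumeration only illustrates why the simple-sum scheme is called degenerate: several neural states map to the same number.

```python
from itertools import product


def decode(state, weights):
    """Linear map from a binary neuron state to a number: sum of weight * output."""
    return sum(w * v for w, v in zip(weights, state))


def binary_weights(q):
    """Assumed binary scheme: neuron i carries weight 2**i."""
    return [2 ** i for i in range(q)]


def simple_sum_weights(q):
    """Assumed simple-sum (degenerate) scheme: every neuron carries weight 1."""
    return [1] * q


def degeneracy(weights):
    """Count how many binary states decode to each representable number."""
    counts = {}
    for state in product((0, 1), repeat=len(weights)):
        value = decode(state, weights)
        counts[value] = counts.get(value, 0) + 1
    return counts


if __name__ == "__main__":
    print(degeneracy(binary_weights(3)))      # each number has exactly one state
    print(degeneracy(simple_sum_weights(3)))  # interior numbers have several states
```

With q neurons the binary scheme covers 2^q distinct numbers with one state each, while the simple-sum scheme covers only q + 1 numbers but offers many equivalent states for most of them.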

Finally  we  point  out  that  there  exists  a  fundamental  limitation  to  the  class  of 
problems  that  can  be  programmed  and  solved  on  the  Hopfield  neural  network.  This 
limitation  comes  from  the  requirement  that  the  energy  function  must  be  a  quadratic 
function of the neuron state variables. All linear problems, such as those discussed in this
paper,  can  satisfy  this  requirement.  However,  general  nonlinear  problems  cannot 
satisfy  this  requirement.  Floating-point  number  representation  is  one  such  nonlinear 
problem. 

Figure Captions


1.  Neural  network  model. 

2.  The  Hitchcock  Problem,  with  4  sources  and  5  demands. 

3.  Neural  dynamics  for  the  Hitchcock  problem,  using  a  binary  number  representation 
scheme.  The  initial  states  are  randomly  generated  from  a  seed,  and  the  transition 
mode  is  direct  asynchronous.  The  final  transportation  matrix  gives  a  network 
flow  cost  of  40.  The  constants  in  the  energy  function  are  chosen  as  A=27, 
B=C=80,  and  D=0.2.  The  constant  x0  is  0.5.  See  Table  3  for  the  flow  matrices  at 
the  iteration  numbers  indicated  by  the  arrows. 

4.  Flow  cost  histogram  for  the  Hitchcock  problem.  The  number  of  samples  is  50000. 

5.  Second  example  of  the  Hitchcock  problem  using  a  binary  number  representation 
scheme.  Uniformly  fuzzy  states  initialized  the  network,  and  a  "softer"  non-linear 
function  was  used  to  give  the  best  solution,  with  a  flow  cost  of  49.  The  weights 
used  were  A=27,  B=C=80,  D=0.2.  The  constant  x0  was  1.0.  The  open  circle 
represents  a  solution  that  satisfied  the  constraints.  See  Table  4  for  flow  matrices 
at  the  iteration  numbers  indicated  by  the  arrows. 

6.  Network  dynamics  of  the  Hitchcock  problem  using  a  degenerate  (simple  sum) 
number representation scheme. The constants used were A=29, B=C=80, D=0.55,
and  x0=0.1.  Open  circles  again  represent  solutions  that  satisfy  the  constraints. 
Flow  matrices  corresponding  to  the  arrows  are  found  in  Table  5. 

7. Continuation of the degenerate network. One of the two-in-50,000 best solutions
is found at time 21. Open circles represent solutions that satisfy the constraints
(i.e. "consistent" solutions). The cost associated with the solution at the sixth
iteration is 43, that associated with the group of consistent solutions starting at
iteration  10  is  40,  and  that  associated  with  the  remaining  consistent  solutions  is 
38.  See  Table  6  for  the  corresponding  flow  matrices. 

8.  Neural  state  transitions  of  the  degenerate  (simple  sum)  Hitchcock  network  (Figs. 
6  and  7).  Iterations  21  through  28  are  shown. 

9.  Second  example  of  the  degenerate  (simple  sum)  Hitchcock  network.  A  random 
initial  state  drove  this  network  to  find  both  of  the  best  solutions.  The  two  flow 
matrices  are  shown  in  Table  7. 

10. Inverse DFT. The transition mode is differential asynchronous. (a) Unknown
signal. (b) Known Fourier transform. (c) Uniformly fuzzy initial states. (d)
Estimated signal after 2 iterations.

11. Inverse DFT, second example. (a) Unknown asymmetric signal. (b) Known
Fourier transform. (c) Random initial states. (d) Estimated signal after 2 iterations.

Table  Captions 

1. (a) Cost matrix for the Hitchcock problem. (b) Sample solution depicting the flow
from source X to demand Y.

2. Neural representation of the flow matrix for the Hitchcock network flow problem;
q neurons are used to represent one element of the flow matrix.

3.  Flow  matrices  for  the  specified  numbers  of  iterations, corresponding  to  the  points 
indicated  on  Fig.  3. 

4.  Flow  matrices  for  the  specified  numbers  of  iterations,  corresponding  to  points 
indicated  on  Fig.  5. 

5.  Flow  matrices  for  the  specified  numbers  of  iterations,  corresponding  to  the  points 
indicated on Fig. 6.

6.  Flow  matrices  for  the  specified  numbers  of  iterations,  corresponding  to  the  points 
indicated on Fig. 7.


7.  Flow  matrices  for  the  specified  numbers  of  iterations,  corresponding  to  the  points 
indicated  on  Fig.  9. 